Level Up your Code Quality with SonarQube - a Beginners Guide

SonarQube! The code scanner that I truly love! In this post we will explain what SonarQube is, what its responsibilities are, and why you would need it. We will then run SonarQube with docker compose and scan a Python Flask example application from a GitHub Actions workflow.

What is SonarQube?

SonarQube is an open-source platform for continuous inspection of code quality. You can think of it as a code scanner. It analyzes source code and identifies the following:

  • Bugs: Real errors that could cause your application to crash or behave unexpectedly.
  • Code Smells: Code that works, but is hard to read, maintain or build on.
  • Security Vulnerabilities: Potential flaws in your code that hackers might exploit.
  • Code Duplication: Repeated logic in your code that makes code harder to manage or maintain.
  • Test Coverage: How much of your code is tested by unit tests.

SonarQube supports many programming languages, including Java, Python, Go and JavaScript.

Why use SonarQube?

Before adopting SonarQube, it helps to understand why you would want to use it in the first place. Some use cases include:

| Use-Case | Description |
| --- | --- |
| Catch Issues Early | SonarQube helps you spot problems before your code hits production. It’s like having an extra pair of eyes on your code 24/7. |
| Improve Code Quality | It teaches good coding practices by showing you where your code could be cleaner or more efficient. |
| Enforce Team Standards | On a team, SonarQube helps enforce consistent code quality and style, making code reviews smoother and reducing technical debt. |
| Security First | For QA engineers and developers, it's a great way to find potential security vulnerabilities before they become real problems. |
| Boost Learning | For students and junior developers, SonarQube acts like a personal mentor, explaining why something is a bad practice and how to improve it. |

How SonarQube fits into your workflow

You typically integrate SonarQube into your CI/CD pipeline (GitHub Actions, GitLab CI, Jenkins, etc.). The process looks like the following:

  1. Write your code and push it to source control.
  2. The SonarQube scan runs inside your pipeline.
  3. Your pipeline reports the outcome, and you can analyze the detailed results in the SonarQube web UI.
  4. Fix the issues reported by SonarQube.

Example use-case using SonarQube

Imagine you are working on a project that you would like to showcase in an examination. You push your code and SonarQube provides feedback on:

  • Unused imports
  • A method that is hard to read, or code duplication.
  • etc.

Now, before you merge your code into the main branch, you are aware of these issues and can fix them by following SonarQube's feedback. Without SonarQube, you might have been penalized for these small issues.

Example of Code Smells

Code smells were something I had to google to understand. If you are also unsure what code smells are, take a look at this smelly Python code:

Code Smell Example 1

def process_data(data):
    result = []
    for i in range(len(data)):
        if data[i] != None:
            if data[i] != "":
                result.append(data[i])
    return result

def unused_function():
    print("This function is never used")

def calculate():
    value = 0
    value = value + 10
    value = value + 20
    value = value + 30
    return value

What has been detected:

  • Redundant if checks:
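    • Checking if data[i] != None and then if data[i] != "" can be simplified.
    • The Pythonic way is: if data[i]. Note that this also drops other falsy values such as 0 or False, so use it only when that is acceptable.
  • Inefficient loop:
    • Looping with range(len(data)) is less readable than looping over the data directly.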
  • Dead code:
    • unused_function() is defined but never called.
  • Repetitive code / lack of abstraction
    • The calculate() function repeats the same pattern and can be simplified or refactored.

The improved version:

def process_data(data):
    return [item for item in data if item]

def calculate():
    return sum([10, 20, 30])

We can already see it's way easier to read.
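One caveat with the shorter version: a plain truthiness check removes every falsy value, not only None and empty strings. The sketch below compares the two behaviours on a made-up input list; process_data_strict is a hypothetical name that is not part of the original code.

```python
def process_data(data):
    # Truthiness filter: drops None and "", but also 0, False, [] and {}
    return [item for item in data if item]


def process_data_strict(data):
    # Hypothetical variant: drops only None and "", like the original nested ifs
    return [item for item in data if item is not None and item != ""]


# Made-up sample input for illustration
sample = ["a", None, "", 0, "b"]

print(process_data(sample))         # ['a', 'b']
print(process_data_strict(sample))  # ['a', 0, 'b']
```

If zeros or False are valid entries in your data, the stricter variant is probably the safer refactoring.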

Code Smell Example 2

Let's take a look at a long / hard to read Python function:

def generate_report(data):
    # 1: Filter valid entries
    valid_data = []
    for item in data:
        if 'name' in item and 'score' in item and item['score'] >= 0:
            valid_data.append(item)

    # 2: Sort the data
    valid_data.sort(key=lambda x: x['score'], reverse=True)

    # 3: Calculate average score
    total = 0
    for item in valid_data:
        total += item['score']
    average = total / len(valid_data) if valid_data else 0

    # 4: Generate formatted report
    report = f"Report ({len(valid_data)} entries):\n"
    for item in valid_data:
        report += f"- {item['name']}: {item['score']}\n"
    report += f"Average Score: {average}\n"

    return report

The code smells that are detected:

  • Does multiple tasks: filtering, sorting, calculating, formatting
  • Hard to test each individual part
  • Cognitive complexity is high

The refactored version which is easier to read and maintain:

def filter_valid_data(data):
    return [item for item in data if 'name' in item and 'score' in item and item['score'] >= 0]

def sort_by_score(data):
    return sorted(data, key=lambda x: x['score'], reverse=True)

def calculate_average(data):
    if not data:
        return 0
    return sum(item['score'] for item in data) / len(data)

def format_report(data, average):
    lines = [f"Report ({len(data)} entries):"]
    lines += [f"- {item['name']}: {item['score']}" for item in data]
    lines.append(f"Average Score: {average}")
    return "\n".join(lines)

def generate_report(data):
    valid_data = filter_valid_data(data)
    sorted_data = sort_by_score(valid_data)
    average = calculate_average(sorted_data)
    return format_report(sorted_data, average)

Why does this even matter, you may ask?

  • Each function does one thing, so it's easier to test and reuse.
  • Code is more readable for teammates and for yourself in the future.
  • SonarQube's feedback is quick and without emotions.
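To make the first point concrete, here is a minimal sketch of how each helper could be checked in isolation. The helper definitions are repeated from above so that the block runs on its own; the sample entries and the test function names are made up for illustration.

```python
def filter_valid_data(data):
    return [item for item in data if 'name' in item and 'score' in item and item['score'] >= 0]


def sort_by_score(data):
    return sorted(data, key=lambda x: x['score'], reverse=True)


def calculate_average(data):
    if not data:
        return 0
    return sum(item['score'] for item in data) / len(data)


def format_report(data, average):
    lines = [f"Report ({len(data)} entries):"]
    lines += [f"- {item['name']}: {item['score']}" for item in data]
    lines.append(f"Average Score: {average}")
    return "\n".join(lines)


# Made-up sample: one negative score and one entry without a name
ENTRIES = [
    {"name": "Ann", "score": 80},
    {"name": "Bob", "score": -5},
    {"score": 50},
    {"name": "Cid", "score": 90},
]


def test_filter_valid_data():
    names = [item["name"] for item in filter_valid_data(ENTRIES)]
    assert names == ["Ann", "Cid"]


def test_sort_by_score():
    ordered = sort_by_score(filter_valid_data(ENTRIES))
    assert [item["name"] for item in ordered] == ["Cid", "Ann"]


def test_calculate_average():
    assert calculate_average([]) == 0
    assert calculate_average(filter_valid_data(ENTRIES)) == 85.0


def test_format_report():
    report = format_report([{"name": "Cid", "score": 90}], 90.0)
    assert report == "Report (1 entries):\n- Cid: 90\nAverage Score: 90.0"
```

Saved in a test file, these functions would be picked up by pytest. With the original single long function, only the final report string could have been asserted on.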

Motivation for integrating SonarQube into your CI Pipeline

These are some reasons why I think it's a good idea to integrate SonarQube into your CI Pipeline:

  1. Automated Code Quality Checks on Every git Commit.
  2. Consistent Code Quality Across the Team.
  3. Catch Bugs and Security Issues Early.
  4. Instant Feedback for Developers.
  5. Supports Shift-Left Testing and Quality Assurance.
  6. Reduces Technical Debt.
  7. Improves Pull Request Workflows.
  8. Education and Onboarding.

Implementing SonarQube with Python and GitHub Actions

SonarQube on Docker

We will use docker compose to start an instance of SonarQube:

docker-compose.yaml
version: '3.8'

services:
  sonarqube:
    image: sonarqube:lts-community
    hostname: sonarqube
    container_name: sonarqube
    environment:
      SONAR_JDBC_URL: jdbc:postgresql://sonarqube-db:5432/sonar
      SONAR_JDBC_USERNAME: sonar
      SONAR_JDBC_PASSWORD: sonar
    depends_on:
      sonarqube-db:
        condition: service_healthy
    volumes:
      - sonarqube-data:/opt/sonarqube/data
      - sonarqube-extensions:/opt/sonarqube/extensions
    ports:
      - "9000:9000"
    networks:
      - public

  sonarqube-db:
    image: postgres:16
    container_name: sonarqube-db
    hostname: sonarqube-db
    environment:
      - POSTGRES_USER=sonar
      - POSTGRES_PASSWORD=sonar
      - POSTGRES_DB=sonar
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}"]
      interval: 10s
      timeout: 5s
      retries: 5
    volumes:
      - postgresql-data:/var/lib/postgresql/data
    networks:
      - public

volumes:
  sonarqube-data: {}
  sonarqube-extensions: {}
  postgresql-data: {}

networks:
  public:
    name: public

Go ahead and start up the containers:

docker compose up -d

Once the containers are up, you should be able to log in to SonarQube at http://localhost:9000 with admin/admin.
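SonarQube can take a little while to start. As a rough sketch, assuming the instance is reachable on http://localhost:9000 as in the compose file above, a small script can poll the api/system/status endpoint until it reports UP. The function names, the number of attempts and the delay are arbitrary choices for illustration.

```python
import json
import time
import urllib.error
import urllib.request


def is_up(payload: str) -> bool:
    # api/system/status answers with JSON such as {"status": "UP", ...}
    return json.loads(payload).get("status") == "UP"


def wait_for_sonarqube(base_url="http://localhost:9000", attempts=30, delay=5):
    # Polls the status endpoint; returns True once SonarQube reports UP
    url = f"{base_url}/api/system/status"
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if is_up(response.read().decode("utf-8")):
                    return True
        except (urllib.error.URLError, OSError, ValueError):
            pass  # container still starting, or the response was not JSON yet
        time.sleep(delay)
    return False
```

Calling wait_for_sonarqube() after docker compose up -d should return True once the web UI is ready for the first login.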

GitHub Actions Runner

Since SonarQube will be running locally, we need to run a GitHub Actions runner locally as well, so that the job triggered by a push to GitHub can reach our SonarQube instance.

If you use GitHub-hosted runners instead, the SonarQube instance needs to be reachable from the public internet so that the runner can communicate with it.

In this case, we can run the GitHub Actions runner with docker compose:

version: '3.8'

services:
  github-runner:
    image: myoung34/github-runner:latest
    container_name: github-runner
    environment:
      - REPO_URL=https://github.com/ruanbekker/sonarqube-python-example
      - RUNNER_NAME=sonarqube-runner
      - RUNNER_TOKEN=$GITHUB_RUNNER_TOKEN
      - RUNNER_WORKDIR=/actions
      - RUNNER_SCOPE=repo
      - LABELS=self-hosted,linux,x64
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./actions:/actions
    restart: always

You can follow this post to create a Runner Token. Export it as GITHUB_RUNNER_TOKEN so that docker compose can substitute it into the file above, and then start the actions runner with docker compose up -d.
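As an alternative to creating the token by hand, the GitHub REST API exposes a registration-token endpoint for repository runners. The sketch below is only an illustration: it assumes you already have a personal access token with admin rights on the repository, and the function names are made up.

```python
import json
import urllib.request

API = "https://api.github.com"


def registration_token_url(owner, repo):
    # REST endpoint that issues a short-lived runner registration token
    return f"{API}/repos/{owner}/{repo}/actions/runners/registration-token"


def create_runner_token(owner, repo, access_token):
    # access_token is assumed to be a personal access token with admin
    # rights on the repository; it is never hard-coded here
    request = urllib.request.Request(
        registration_token_url(owner, repo),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {access_token}",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["token"]
```

The returned value could then be exported as GITHUB_RUNNER_TOKEN before running docker compose up -d. Registration tokens expire after a short time, so generate one right before starting the runner.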

Python Application

We will use a Python Flask application with the following directory structure:

├── Dockerfile
├── app
│   ├── __init__.py
│   └── app.py
├── requirements.txt
└── tests
    └── test_app.py

The Dockerfile:

Dockerfile
FROM python:3.9
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "app/app.py"]

The app/__init__.py file is empty, and app/app.py contains:

app.py
from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello, SonarQube!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

Then tests/test_app.py:

tests/test_app.py
import pytest
from app.app import app

@pytest.fixture
def client():
    return app.test_client()

def test_home(client):
    response = client.get("/")
    assert response.status_code == 200
    assert response.data == b"Hello, SonarQube!"

And the requirements.txt:

requirements.txt
flask
pytest

Configure SonarQube Project

First, we create a new project by selecting "Manually":

Then we can configure the project by providing a "name", "project key" and "branch":

Then we define how the repository will be analyzed, which in our case is with GitHub Actions:

Then we need to define the secrets (SONAR_HOST_URL and SONAR_TOKEN) that we need to store in GitHub Actions:

After storing the GitHub Actions secrets, we need to define the GitHub Actions workflow inside .github/workflows/build.yml with:

.github/workflows/build.yml
name: SonarQube Analysis

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

jobs:
  build:
    name: Build, Test, and Analyze
    runs-on: self-hosted
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'

      - name: Cache dependencies
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-

      - name: Install Dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install pytest pytest-cov black flake8

      - name: Run Black (Code Formatter)
        run: black --check .

      - name: Run Flake8 (Linting)
        run: flake8 .

      - name: Run Tests with Coverage
        run: |
          export PYTHONPATH=$(pwd)
          pytest --cov=. --cov-report=xml --junitxml=pytest-report.xml

      - name: Upload Pytest Results
        uses: actions/upload-artifact@v4
        with:
          name: pytest-results
          path: pytest-report.xml

      - name: Upload Coverage Report
        uses: actions/upload-artifact@v4
        with:
          name: coverage-report
          path: coverage.xml

      - name: Run SonarQube Scan
        uses: sonarsource/sonarqube-scan-action@master
        env:
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
          SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}

      - name: SonarQube Quality Gate Check
        uses: sonarsource/sonarqube-quality-gate-action@master
        timeout-minutes: 5
        env:
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}

Once we have defined our workflow and committed it to our code repository, we can review the overview of our project results:

It will also show us the number of lines in our code:

And also we can see our project overview, like bugs, vulnerabilities, code smells, coverage, etc:

Thank You

And there we have it: we implemented SonarQube with GitHub Actions using a Python Flask application, and now every commit to the code repository gives us instant feedback on code quality.

Thanks for reading. If you like my content, feel free to check out my website, subscribe to my newsletter, or follow me at @ruanbekker on Twitter.
