prowler-api

Prowler API patterns: RLS, RBAC, providers, Celery tasks. Trigger: When working in api/ on models/serializers/viewsets/filters/tasks involving tenant isolation (RLS), RBAC, or provider lifecycle.


When to Use

Use this skill for Prowler-specific patterns:

  • Row-Level Security (RLS) / tenant isolation
  • RBAC permissions and role checks
  • Provider lifecycle and validation
  • Celery tasks with tenant context
  • Multi-database architecture (4-database setup)

For generic DRF patterns (ViewSets, Serializers, Filters, JSON:API), use django-drf skill.


Critical Rules

  • ALWAYS use rls_transaction(tenant_id) when querying outside ViewSet context
  • ALWAYS use get_role() before checking permissions (returns FIRST role only)
  • ALWAYS use @set_tenant then @handle_provider_deletion decorator order
  • ALWAYS use explicit through models for M2M relationships (required for RLS)
  • NEVER access Provider.objects without RLS context in Celery tasks
  • NEVER bypass RLS by using raw SQL or connection.cursor()
  • NEVER use Django's default M2M - RLS requires through models with tenant_id

Note: rls_transaction() accepts both UUID objects and strings - it converts internally via str(value).
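Conceptually, rls_transaction() is a context manager that sets the tenant on the PostgreSQL session before yielding. The sketch below is a simplified stand-in (RecordingCursor and the exact set_config SQL are illustrative assumptions, not Prowler's implementation):

```python
import uuid
from contextlib import contextmanager

class RecordingCursor:
    """Stand-in for a DB cursor; records executed statements."""
    def __init__(self):
        self.executed = []

    def execute(self, sql, params=None):
        self.executed.append((sql, params))

@contextmanager
def rls_transaction(tenant_id, cursor):
    # Accepts UUID objects or plain strings: normalized via str()
    cursor.execute("SELECT set_config('api.tenant_id', %s, TRUE)", [str(tenant_id)])
    yield cursor
    # set_config(..., TRUE) is transaction-local, so it resets automatically
    # when the surrounding transaction commits or rolls back.

cursor = RecordingCursor()
with rls_transaction(uuid.uuid4(), cursor):       # a UUID object works...
    pass
with rls_transaction("9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d", cursor):  # ...and so does a string
    pass
```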


Architecture Overview

4-Database Architecture

| Database | Alias | Purpose | RLS |
|----------|-------|---------|-----|
| default | prowler_user | Standard API queries | Yes |
| admin | admin | Migrations, auth bypass | No |
| replica | prowler_user | Read-only queries | Yes |
| admin_replica | admin | Admin read replica | No |

# When to use admin (bypasses RLS)
from api.db_router import MainRouter
User.objects.using(MainRouter.admin_db).get(id=user_id)  # Auth lookups

# Standard queries use default (RLS enforced)
Provider.objects.filter(connected=True)  # Requires rls_transaction context

RLS Transaction Flow

Request → Authentication → BaseRLSViewSet.initial()
                                    │
                                    ├─ Extract tenant_id from JWT
                                    ├─ SET api.tenant_id = 'uuid' (PostgreSQL)
                                    └─ All queries now tenant-scoped

Implementation Checklist

When implementing Prowler-specific API features:

| # | Pattern | Reference | Key Points |
|---|---------|-----------|------------|
| 1 | RLS Models | api/rls.py | Inherit RowLevelSecurityProtectedModel, add constraint |
| 2 | RLS Transactions | api/db_utils.py | Use rls_transaction(tenant_id) context manager |
| 3 | RBAC Permissions | api/rbac/permissions.py | get_role(), get_providers(), Permissions enum |
| 4 | Provider Validation | api/models.py | validate_<provider>_uid() methods on Provider model |
| 5 | Celery Tasks | tasks/tasks.py, api/decorators.py, config/celery.py | Task definitions, decorators (@set_tenant, @handle_provider_deletion), RLSTask base |
| 6 | RLS Serializers | api/v1/serializers.py | Inherit RLSSerializer to auto-inject tenant_id |
| 7 | Through Models | api/models.py | ALL M2M must use explicit through with tenant_id |

Full file paths: See references/file-locations.md


Decision Trees

Which Base Model?

Tenant-scoped data       → RowLevelSecurityProtectedModel
Global/shared data       → models.Model + BaseSecurityConstraint (rare)
Partitioned time-series  → PostgresPartitionedModel + RowLevelSecurityProtectedModel
Soft-deletable           → Add is_deleted + ActiveProviderManager

Which Manager?

Normal queries           → Model.objects (excludes deleted)
Include deleted records  → Model.all_objects
Celery task context      → Must use rls_transaction() first

Which Database?

Standard API queries     → default (automatic via ViewSet)
Read-only operations     → replica (automatic for GET in BaseRLSViewSet)
Auth/admin operations    → MainRouter.admin_db
Cross-tenant lookups     → MainRouter.admin_db (use sparingly!)

Celery Task Decorator Order?

@shared_task(base=RLSTask, name="...", queue="...")
@set_tenant                    # First: pops tenant_id from kwargs, sets tenant context
@handle_provider_deletion      # Second: handles deleted providers
def my_task(provider_id):      # tenant_id was consumed by @set_tenant (use keep_tenant=True to receive it)
    pass

RLS Model Pattern

from api.rls import RowLevelSecurityProtectedModel, RowLevelSecurityConstraint

class MyModel(RowLevelSecurityProtectedModel):
    # tenant FK inherited from parent
    id = models.UUIDField(primary_key=True, default=uuid4, editable=False)
    name = models.CharField(max_length=255)
    inserted_at = models.DateTimeField(auto_now_add=True, editable=False)
    updated_at = models.DateTimeField(auto_now=True, editable=False)

    class Meta(RowLevelSecurityProtectedModel.Meta):
        db_table = "my_models"
        constraints = [
            RowLevelSecurityConstraint(
                field="tenant_id",
                name="rls_on_%(class)s",
                statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
            ),
        ]

    class JSONAPIMeta:
        resource_name = "my-models"

M2M Relationships (MUST use through models)

class Resource(RowLevelSecurityProtectedModel):
    tags = models.ManyToManyField(
        ResourceTag,
        through="ResourceTagMapping",  # REQUIRED for RLS
    )

class ResourceTagMapping(RowLevelSecurityProtectedModel):
    # Through model MUST have tenant_id for RLS
    resource = models.ForeignKey(Resource, on_delete=models.CASCADE)
    tag = models.ForeignKey(ResourceTag, on_delete=models.CASCADE)

    class Meta:
        constraints = [
            RowLevelSecurityConstraint(
                field="tenant_id",
                name="rls_on_%(class)s",
                statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
            ),
        ]

Async Task Response Pattern (202 Accepted)

For long-running operations, return 202 with task reference:

@action(detail=True, methods=["post"], url_name="connection")
def connection(self, request, pk=None):
    with transaction.atomic():
        task = check_provider_connection_task.delay(
            provider_id=pk, tenant_id=self.request.tenant_id
        )
    prowler_task = Task.objects.get(id=task.id)
    serializer = TaskSerializer(prowler_task)
    return Response(
        data=serializer.data,
        status=status.HTTP_202_ACCEPTED,
        headers={"Content-Location": reverse("task-detail", kwargs={"pk": prowler_task.id})}
    )

Providers (11 Supported)

| Provider | UID Format | Example |
|----------|------------|---------|
| AWS | 12 digits | 123456789012 |
| Azure | UUID v4 | a1b2c3d4-e5f6-... |
| GCP | 6-30 chars, lowercase, letter start | my-gcp-project |
| M365 | Valid domain | contoso.onmicrosoft.com |
| Kubernetes | 2-251 chars | arn:aws:eks:... |
| GitHub | 1-39 chars | my-org |
| IaC | Git URL | https://github.com/user/repo.git |
| Oracle Cloud | OCID format | ocid1.tenancy.oc1.. |
| MongoDB Atlas | 24-char hex | 507f1f77bcf86cd799439011 |
| Alibaba Cloud | 16 digits | 1234567890123456 |

Adding new provider: Add to ProviderChoices enum + create validate_<provider>_uid() staticmethod.
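For illustration, a validator usually reduces to a format check derived from the table above. These regexes are inferred from the listed UID formats, not copied from api/models.py:

```python
import re

def validate_aws_uid(uid: str) -> bool:
    # AWS account IDs are exactly 12 digits
    return re.fullmatch(r"\d{12}", uid) is not None

def validate_gcp_uid(uid: str) -> bool:
    # 6-30 chars, lowercase, must start with a letter (digits/hyphens allowed after)
    return re.fullmatch(r"[a-z][a-z0-9-]{5,29}", uid) is not None

def validate_alibaba_uid(uid: str) -> bool:
    # Alibaba Cloud account IDs are 16 digits
    return re.fullmatch(r"\d{16}", uid) is not None
```

The real staticmethods raise a validation error rather than returning a bool; the regexes here only approximate the formats in the table.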


RBAC Permissions

| Permission | Controls |
|------------|----------|
| MANAGE_USERS | User CRUD, role assignments |
| MANAGE_ACCOUNT | Tenant settings |
| MANAGE_BILLING | Billing/subscription |
| MANAGE_PROVIDERS | Provider CRUD |
| MANAGE_INTEGRATIONS | Integration config |
| MANAGE_SCANS | Scan execution |
| UNLIMITED_VISIBILITY | See all providers (bypasses provider_groups) |
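A pure-Python mirror of the table above (the actual enum lives in api/rbac/permissions.py; the member values here are assumptions, only the names come from the table):

```python
from enum import Enum

class Permissions(Enum):
    # Member values are assumptions; only the names come from the table above
    MANAGE_USERS = "manage_users"
    MANAGE_ACCOUNT = "manage_account"
    MANAGE_BILLING = "manage_billing"
    MANAGE_PROVIDERS = "manage_providers"
    MANAGE_INTEGRATIONS = "manage_integrations"
    MANAGE_SCANS = "manage_scans"
    UNLIMITED_VISIBILITY = "unlimited_visibility"

def can(role_permissions: set, required: Permissions) -> bool:
    """Simple membership check against a role's permission set."""
    return required in role_permissions
```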

RBAC Visibility Pattern

def get_queryset(self):
    user_role = get_role(self.request.user)
    if user_role.unlimited_visibility:
        return Model.objects.filter(tenant_id=self.request.tenant_id)
    else:
        # Filter by provider_groups assigned to role
        return Model.objects.filter(provider__in=get_providers(user_role))

Celery Queues

| Queue | Purpose |
|-------|---------|
| scans | Prowler scan execution |
| overview | Dashboard aggregations (severity, attack surface) |
| compliance | Compliance report generation |
| integrations | External integrations (Jira, S3, Security Hub) |
| deletion | Provider/tenant deletion (async) |
| backfill | Historical data backfill operations |
| scan-reports | Output generation (CSV, JSON, HTML, PDF) |

Task Composition (Canvas)

Use Celery's Canvas primitives for complex workflows:

| Primitive | Use For |
|-----------|---------|
| chain() | Sequential execution: A → B → C |
| group() | Parallel execution: A, B, C simultaneously |
| Combined | Chain with nested groups for complex workflows |

Note: Use .si() (immutable signature) so the previous task's result is not passed to the next task; use .s() when you do want the result passed along.

Examples: See assets/celery_patterns.py for chain, group, and combined patterns.


Beat Scheduling (Periodic Tasks)

| Operation | Key Points |
|-----------|------------|
| Create schedule | IntervalSchedule.objects.get_or_create(every=24, period=HOURS) |
| Create periodic task | Use task name (not function), kwargs=json.dumps(...) |
| Delete scheduled task | PeriodicTask.objects.filter(name=...).delete() |
| Avoid race conditions | Use countdown=5 to wait for DB commit |

Examples: See assets/celery_patterns.py for schedule_provider_scan pattern.


Advanced Task Patterns

@set_tenant Behavior

| Mode | tenant_id in kwargs | tenant_id passed to function |
|------|---------------------|------------------------------|
| @set_tenant (default) | Popped (removed) | NO - function doesn't receive it |
| @set_tenant(keep_tenant=True) | Read but kept | YES - function receives it |

Key Patterns

| Pattern | Description |
|---------|-------------|
| bind=True | Access self.request.id, self.request.retries |
| get_task_logger(__name__) | Proper logging in Celery tasks |
| SoftTimeLimitExceeded | Catch to save progress before hard kill |
| countdown=30 | Defer execution by N seconds |
| eta=datetime(...) | Execute at specific time |

Examples: See assets/celery_patterns.py for all advanced patterns.


Celery Configuration

| Setting | Value | Purpose |
|---------|-------|---------|
| BROKER_VISIBILITY_TIMEOUT | 86400 (24h) | Prevent re-queue for long tasks |
| CELERY_RESULT_BACKEND | django-db | Store results in PostgreSQL |
| CELERY_TASK_TRACK_STARTED | True | Track when tasks start |
| soft_time_limit | Task-specific | Raises SoftTimeLimitExceeded |
| time_limit | Task-specific | Hard kill (SIGKILL) |

Full config: See assets/celery_patterns.py and actual files at config/celery.py, config/settings/celery.py.
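As a settings sketch (for a Redis broker the visibility timeout is set via the transport options; the exact names below follow Celery's documented settings and are assumptions about Prowler's config):

```python
# Sketch of config/settings/celery.py, not the actual Prowler file
CELERY_BROKER_TRANSPORT_OPTIONS = {"visibility_timeout": 86400}  # 24h, prevents re-queue of long tasks
CELERY_RESULT_BACKEND = "django-db"   # django-celery-results backend in PostgreSQL
CELERY_TASK_TRACK_STARTED = True      # tasks report the STARTED state

# Per-task limits go on the task decorator, e.g.:
# @shared_task(soft_time_limit=300, time_limit=360)
```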


UUIDv7 for Partitioned Tables

Finding and ResourceFindingMapping use UUIDv7 for time-based partitioning:

from uuid6 import uuid7
from api.uuid_utils import uuid7_start, uuid7_end, datetime_to_uuid7

# Partition-aware filtering
start = uuid7_start(datetime_to_uuid7(date_from))
end = uuid7_end(datetime_to_uuid7(date_to), settings.FINDINGS_TABLE_PARTITION_MONTHS)
queryset.filter(id__gte=start, id__lt=end)

Why UUIDv7? Time-ordered UUIDs enable PostgreSQL to prune partitions during range queries.
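The time-ordering property can be demonstrated with a minimal UUIDv7 constructor (illustrative only; Prowler uses the uuid6 package and its own helpers in api/uuid_utils.py):

```python
import uuid

def uuid7_from_ms(unix_ms: int, rand: int = 0) -> uuid.UUID:
    """Minimal UUIDv7: 48-bit ms timestamp, version 7, RFC variant."""
    value = unix_ms << 80            # Unix ms timestamp in the top 48 bits
    value |= 0x7 << 76               # version nibble = 7
    value |= 0x2 << 62               # RFC variant bits (10x)
    value |= rand & ((1 << 62) - 1)  # random/counter bits (zeroed here)
    return uuid.UUID(int=value)

earlier = uuid7_from_ms(1_700_000_000_000)
later = uuid7_from_ms(1_700_000_060_000)  # one minute later
# Integer/lexicographic order follows time order, which is what lets
# PostgreSQL prune partitions on id range filters.
assert earlier < later
```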


Batch Operations with RLS

from api.db_utils import batch_delete, create_objects_in_batches, update_objects_in_batches

# Delete in batches (RLS-aware)
batch_delete(tenant_id, queryset, batch_size=1000)

# Bulk create with RLS
create_objects_in_batches(tenant_id, Finding, objects, batch_size=500)

# Bulk update with RLS
update_objects_in_batches(tenant_id, Finding, objects, fields=["status"], batch_size=500)
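Under the hood these helpers amount to chunking plus a per-batch RLS context. A generic chunker (hypothetical; the real helpers in api/db_utils.py also handle bulk_create/bulk_update specifics):

```python
from itertools import islice

def chunked(iterable, batch_size):
    """Yield successive lists of at most batch_size items."""
    it = iter(iterable)
    while batch := list(islice(it, batch_size)):
        yield batch

# Each batch would then be written inside `with rls_transaction(tenant_id):`
# so the tenant context is guaranteed for every bulk statement.
```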

Security Patterns

Full examples: See assets/security_patterns.py

Tenant Isolation Summary

| Pattern | Rule |
|---------|------|
| RLS in ViewSets | Automatic via BaseRLSViewSet - tenant_id from JWT |
| RLS in Celery | MUST use @set_tenant + rls_transaction(tenant_id) |
| Cross-tenant validation | Defense-in-depth: verify obj.tenant_id == request.tenant_id |
| Never trust user input | Use request.tenant_id from JWT, never request.data.get("tenant_id") |
| Admin DB bypass | Only for cross-tenant admin ops - exposes ALL tenants' data |
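The cross-tenant guard is a one-line comparison; sketched here with a stand-in exception (a real ViewSet would raise DRF's PermissionDenied):

```python
class TenantMismatch(Exception):
    """Stand-in for rest_framework.exceptions.PermissionDenied."""

def assert_same_tenant(obj_tenant_id, request_tenant_id):
    # Defense-in-depth: even with RLS active, verify ownership explicitly.
    # str() normalizes UUID objects vs. string representations.
    if str(obj_tenant_id) != str(request_tenant_id):
        raise TenantMismatch("Object does not belong to the requesting tenant")
```

Call it with obj.tenant_id and request.tenant_id right after fetching the object.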

Celery Task Security Summary

| Pattern | Rule |
|---------|------|
| Named tasks only | NEVER use dynamic task names from user input |
| Validate arguments | Check UUID format before database queries |
| Safe queuing | Use transaction.on_commit() to enqueue AFTER commit |
| Modern retries | Use autoretry_for, retry_backoff, retry_jitter |
| Time limits | Set soft_time_limit and time_limit to prevent hung tasks |
| Idempotency | Use update_or_create or idempotency keys |

Quick Reference

# Safe task queuing - task only enqueued after transaction commits
with transaction.atomic():
    provider = Provider.objects.create(**data)
    transaction.on_commit(
        lambda: verify_provider_connection.delay(
            tenant_id=str(request.tenant_id),
            provider_id=str(provider.id)
        )
    )

# Modern retry pattern
@shared_task(
    base=RLSTask,
    bind=True,
    autoretry_for=(ConnectionError, TimeoutError, OperationalError),
    retry_backoff=True,
    retry_backoff_max=600,
    retry_jitter=True,
    max_retries=5,
    soft_time_limit=300,
    time_limit=360,
)
@set_tenant(keep_tenant=True)  # keep tenant_id in kwargs: the body passes it to rls_transaction
def sync_provider_data(self, tenant_id, provider_id):
    with rls_transaction(tenant_id):
        # ... task logic
        pass

# Idempotent task - safe to retry
@shared_task(base=RLSTask, acks_late=True)
@set_tenant(keep_tenant=True)  # keep tenant_id so the body can open rls_transaction
def process_finding(tenant_id, finding_uid, data):
    with rls_transaction(tenant_id):
        Finding.objects.update_or_create(uid=finding_uid, defaults=data)

Production Deployment Checklist

Full settings: See references/production-settings.md

Run before every production deployment:

cd api && poetry run python src/backend/manage.py check --deploy

Critical Settings

| Setting | Production Value | Risk if Wrong |
|---------|------------------|---------------|
| DEBUG | False | Exposes stack traces, settings, SQL queries |
| SECRET_KEY | Env var, rotated | Session hijacking, CSRF bypass |
| ALLOWED_HOSTS | Explicit list | Host header attacks |
| SECURE_SSL_REDIRECT | True | Credentials sent over HTTP |
| SESSION_COOKIE_SECURE | True | Session cookies over HTTP |
| CSRF_COOKIE_SECURE | True | CSRF tokens over HTTP |
| SECURE_HSTS_SECONDS | 31536000 (1 year) | Downgrade attacks |
| CONN_MAX_AGE | 60 or higher | Connection pool exhaustion |
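Translated into a settings sketch (the env-var name and host are placeholders; this is not Prowler's actual settings module):

```python
# Sketch of a production settings module
import os

DEBUG = False
SECRET_KEY = os.environ["DJANGO_SECRET_KEY"]   # placeholder env-var name; rotate regularly
ALLOWED_HOSTS = ["api.example.com"]            # explicit list, never ["*"]
SECURE_SSL_REDIRECT = True
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True
SECURE_HSTS_SECONDS = 31536000                 # 1 year
DATABASES = {"default": {"CONN_MAX_AGE": 60}}  # plus ENGINE/NAME/etc., omitted here
```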

Commands

# Development
cd api && poetry run python src/backend/manage.py runserver
cd api && poetry run python src/backend/manage.py shell

# Celery
cd api && poetry run celery -A config.celery worker -l info -Q scans,overview
cd api && poetry run celery -A config.celery beat -l info

# Testing
cd api && poetry run pytest -x --tb=short

# Production checks
cd api && poetry run python src/backend/manage.py check --deploy

Resources

Local References

Related Skills

  • Generic DRF Patterns: Use django-drf skill
  • API Testing: Use prowler-test-api skill

Context7 MCP (Recommended)

Prerequisite: Install Context7 MCP server for up-to-date documentation lookup.

When implementing or debugging Prowler-specific patterns, query these libraries via mcp_context7_query-docs:

| Library | Context7 ID | Use For |
|---------|-------------|---------|
| Celery | /websites/celeryq_dev_en_stable | Task patterns, queues, error handling |
| django-celery-beat | /celery/django-celery-beat | Periodic task scheduling |
| Django | /websites/djangoproject_en_5_2 | Models, ORM, constraints, indexes |

Example queries:

mcp_context7_query-docs(libraryId="/websites/celeryq_dev_en_stable", query="shared_task decorator retry patterns")
mcp_context7_query-docs(libraryId="/celery/django-celery-beat", query="periodic task database scheduler")
mcp_context7_query-docs(libraryId="/websites/djangoproject_en_5_2", query="model constraints CheckConstraint UniqueConstraint")

Note: Use mcp_context7_resolve-library-id first if you need to find the correct library ID.
