Survey Analyzer

Comprehensive survey data analysis with Likert scales, cross-tabs, and sentiment analysis.

Features

Likert Scale Analysis: Agreement scale scoring and visualization
Cross-Tabulation: Relationship analysis between categorical variables
Frequency Analysis: Response distributions and percentages
Sentiment Scoring: Text response sentiment analysis
Open-Ended Analysis: Theme extraction from text responses
Statistical Tests: Chi-square, correlations, significance testing
Visualizations: Bar charts, heatmaps, word clouds, distribution plots
Report Generation: Comprehensive PDF/HTML reports

Quick Start

from survey_analyzer import SurveyAnalyzer

analyzer = SurveyAnalyzer()

# Load survey data
analyzer.load_csv('survey_responses.csv')

# Analyze Likert scale question
results = analyzer.likert_analysis('satisfaction', scale_type='agreement')
print(f"Mean score: {results['mean_score']:.2f}")

# Cross-tabulation
crosstab = analyzer.crosstab('age_group', 'product_preference')
print(crosstab)

# Generate report
analyzer.generate_report('survey_report.pdf')

CLI Usage

# Analyze Likert scale
python survey_analyzer.py --data survey.csv --likert satisfaction --output results.pdf

# Cross-tabulation
python survey_analyzer.py --data survey.csv --crosstab age_group product --output crosstab.png

# Sentiment analysis
python survey_analyzer.py --data survey.csv --sentiment comments --output sentiment.html

# Full report
python survey_analyzer.py --data survey.csv --report --output full_report.pdf

API Reference

SurveyAnalyzer Class

class SurveyAnalyzer:
    def __init__(self)

    # Data Loading
    def load_csv(self, filepath, **kwargs) -> 'SurveyAnalyzer'
    def load_data(self, data: pd.DataFrame) -> 'SurveyAnalyzer'

    # Likert Scale Analysis
    def likert_analysis(self, column, scale_type='agreement') -> Dict
    def likert_comparison(self, columns: List[str]) -> pd.DataFrame
    def plot_likert(self, column, output, scale_type='agreement') -> str

    # Frequency Analysis
    def frequency_table(self, column) -> pd.DataFrame
    def multiple_choice(self, column, delimiter=',') -> pd.DataFrame
    def plot_frequencies(self, column, output, top_n=None) -> str

    # Cross-Tabulation
    def crosstab(self, row_var, col_var, normalize=None) -> pd.DataFrame
    def chi_square_test(self, row_var, col_var) -> Dict
    def plot_crosstab(self, row_var, col_var, output) -> str

    # Sentiment Analysis
    def sentiment_analysis(self, column) -> pd.DataFrame
    def sentiment_summary(self, column) -> Dict
    def plot_sentiment(self, column, output) -> str

    # Open-Ended Analysis
    def word_frequency(self, column, top_n=20) -> pd.DataFrame
    def word_cloud(self, column, output) -> str
    def extract_themes(self, column, n_themes=5) -> List[str]

    # Statistics
    def satisfaction_score(self, columns: List[str]) -> Dict
    def response_rate(self) -> Dict
    def demographics_summary(self, columns: List[str]) -> pd.DataFrame

    # Reporting
    def generate_report(self, output, format='pdf') -> str
    def summary(self) -> str

Likert Scale Analysis

Standard Scales

# 5-point agreement scale
analyzer.likert_analysis('satisfaction', scale_type='agreement')
# 1=Strongly Disagree, 2=Disagree, 3=Neutral, 4=Agree, 5=Strongly Agree

# 5-point frequency scale
analyzer.likert_analysis('usage', scale_type='frequency')
# 1=Never, 2=Rarely, 3=Sometimes, 4=Often, 5=Always

# Custom scale
analyzer.likert_analysis('rating', scale_type='custom',
                        labels=['Poor', 'Fair', 'Good', 'Excellent'])

Results

results = analyzer.likert_analysis('satisfaction')
# {
#     'mean_score': 4.2,
#     'median': 4,
#     'mode': 5,
#     'distribution': {1: 2, 2: 5, 3: 15, 4: 40, 5: 38},
#     'percentages': {1: 2%, 2: 5%, 3: 15%, 4: 40%, 5: 38%},
#     'top_2_box': 78%,  # % Agree + Strongly Agree
#     'bottom_2_box': 7%  # % Disagree + Strongly Disagree
# }

Visualization

# Stacked bar chart
analyzer.plot_likert('satisfaction', 'likert_chart.png')

# Compare multiple questions
analyzer.likert_comparison(['quality', 'value', 'service'])
analyzer.plot_likert_comparison(['quality', 'value', 'service'],
                               'comparison.png')

Frequency Analysis

Single Choice

freq = analyzer.frequency_table('age_group')
#          Count  Percentage
# 18-24      45      22.5%
# 25-34      78      39.0%
# 35-44      52      26.0%
# 45+        25      12.5%

# Plot
analyzer.plot_frequencies('age_group', 'age_distribution.png')

Multiple Choice

For questions allowing multiple selections:

# Data format: "Option A, Option B, Option C"
results = analyzer.multiple_choice('features_liked', delimiter=',')
#                Count  Percentage
# Price            120      60%
# Quality           95      47.5%
# Design            80      40%
# Durability        70      35%

analyzer.plot_frequencies('features_liked', 'features.png', top_n=10)

Cross-Tabulation

Basic Cross-Tab

crosstab = analyzer.crosstab('age_group', 'satisfaction')
#           Satisfied  Neutral  Dissatisfied
# 18-24          30       10           5
# 25-34          60       15           3
# 35-44          40        8           4
# 45+            18        5           2

# With percentages
crosstab_pct = analyzer.crosstab('age_group', 'satisfaction',
                                 normalize='index')  # Row percentages

Statistical Testing

result = analyzer.chi_square_test('age_group', 'satisfaction')
# {
#     'statistic': 12.45,
#     'p_value': 0.014,
#     'significant': True,
#     'interpretation': 'There is a significant relationship between
#                        age_group and satisfaction (p=0.014)'
# }

Visualization

# Heatmap
analyzer.plot_crosstab('age_group', 'satisfaction', 'crosstab_heatmap.png')

Sentiment Analysis

Analyze open-ended text responses:

# Analyze all comments
sentiment_df = analyzer.sentiment_analysis('comments')
#      comment                          polarity  sentiment
# 0    "Great product!"                  0.8      Positive
# 1    "Could be better"                 0.1      Neutral
# 2    "Very disappointed"              -0.6      Negative

# Summary
summary = analyzer.sentiment_summary('comments')
# {
#     'positive': 65%,
#     'neutral': 20%,
#     'negative': 15%,
#     'avg_polarity': 0.35
# }

# Visualize
analyzer.plot_sentiment('comments', 'sentiment_distribution.png')

Open-Ended Analysis

Word Frequency

words = analyzer.word_frequency('comments', top_n=20)
#        Word   Frequency
# 0     great       45
# 1   quality       38
# 2     price       32
# ...

Word Cloud

analyzer.word_cloud('comments', 'wordcloud.png')

Theme Extraction

themes = analyzer.extract_themes('feedback', n_themes=5)
# ['product quality', 'customer service', 'pricing',
#  'delivery speed', 'user experience']

Satisfaction Metrics

Net Promoter Score (NPS)

nps = analyzer.nps_score('recommendation')  # 0-10 scale
# {
#     'promoters': 65%,   # 9-10
#     'passives': 25%,    # 7-8
#     'detractors': 10%,  # 0-6
#     'nps': 55
# }

Overall Satisfaction

satisfaction = analyzer.satisfaction_score([
    'product_quality',
    'customer_service',
    'value_for_money',
    'ease_of_use'
])
# {
#     'overall_score': 4.3,
#     'category_scores': {...},
#     'satisfaction_rate': 86%  # % scoring 4-5
# }

Demographics Analysis

demographics = analyzer.demographics_summary([
    'age_group',
    'gender',
    'location',
    'income_range'
])

# Returns frequency tables for each demographic variable

Response Rate Analysis

response_rate = analyzer.response_rate()
# {
#     'total_respondents': 200,
#     'completion_rate': 85%,
#     'average_time': '5m 30s',
#     'dropout_points': {
#         'question_5': 8%,
#         'question_12': 5%
#     }
# }

Report Generation

Comprehensive Report

analyzer.generate_report('survey_report.pdf', format='pdf')

Report includes:

Executive summary
Response rate and demographics
Question-by-question analysis
Likert scale visualizations
Cross-tabulations
Sentiment analysis
Key findings and recommendations

Custom Report Sections

analyzer.set_report_sections([
    'executive_summary',
    'demographics',
    'likert_questions',
    'cross_tabs',
    'sentiment',
    'recommendations'
])

Advanced Features

Filter by Segment

# Analyze subset of responses
analyzer.filter('age_group', '25-34')
results = analyzer.likert_analysis('satisfaction')
analyzer.clear_filter()

Compare Segments

comparison = analyzer.compare_segments(
    segment_col='age_group',
    metric_col='satisfaction'
)
# Shows how different segments scored the metric

Trend Analysis

For longitudinal surveys:

trends = analyzer.trend_analysis(
    metric='satisfaction',
    time_col='survey_date',
    period='month'
)
analyzer.plot_trends(trends, 'satisfaction_trend.png')

Dependencies

pandas>=2.0.0
numpy>=1.24.0
scipy>=1.10.0
textblob>=0.17.0
matplotlib>=3.7.0
seaborn>=0.12.0
wordcloud>=1.9.0
reportlab>=4.0.0

survey-analyzer

Safety Notice

Copy this and send it to your AI assistant to learn

Survey Analyzer

Features

Quick Start

CLI Usage

API Reference

SurveyAnalyzer Class

Likert Scale Analysis

Standard Scales

Results

Visualization

Frequency Analysis

Single Choice

Multiple Choice

Cross-Tabulation

Basic Cross-Tab

Statistical Testing

Visualization

Sentiment Analysis

Open-Ended Analysis

Word Frequency

Word Cloud

Theme Extraction

Satisfaction Metrics

Net Promoter Score (NPS)

Overall Satisfaction

Demographics Analysis

Response Rate Analysis

Report Generation

Comprehensive Report

Custom Report Sections

Advanced Features

Filter by Segment

Compare Segments

Trend Analysis

Dependencies

Source Transparency

Related Skills

scientific-paper-figure-generator

ocr-document-processor

crypto-ta-analyzer

text-summarizer