AI Spam Detection ​

Automatically detect and filter spam content using AI-powered analysis.

Overview ​

AI spam detection analyzes comments and feature requests to identify spam, toxic content, and policy violations.

Requirements

Requires an API key from a supported provider (OpenAI, Anthropic, Gemini, Azure, Perspective, or Tisane). Local keyword-based detection is also available and needs no API key.

Setup ​

  1. Settings → Spam Settings
  2. Select spam detection provider
  3. Enter API key
  4. Configure thresholds
  5. Enable detection

Detection Methods ​

AI-Powered Detection ​

Detected content:

  • Spam and promotional content
  • Hate speech
  • Harassment
  • Violence
  • Sexual content
  • Profanity
  • Toxicity

Providers:

  • OpenAI Moderation API
  • Anthropic Claude
  • Google Gemini
  • Azure Content Safety
  • Google Perspective API
  • Tisane AI

Local Keyword Detection ​

Fallback option without API keys:

  • Custom keyword lists
  • Balanced or strict presets
  • Multi-language support
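
As a rough illustration of how keyword-based detection can work, here is a minimal sketch. The preset contents, the `KEYWORDS` table, and the `keyword_hits` function are all hypothetical and are not the product's actual lists or API. Matching uses Unicode-aware case folding and word boundaries, so it is not limited to English text.

```python
import re
from typing import Iterable, List

# Hypothetical preset lists for illustration only; the real presets are not documented here.
KEYWORDS = {
    "balanced": {"free money", "click here", "buy now"},
    "strict": {"free money", "click here", "buy now", "discount", "promo"},
}


def keyword_hits(text: str, preset: str = "balanced", extra: Iterable[str] = ()) -> List[str]:
    """Return the sorted keywords from the preset (plus custom ones) found in text."""
    keywords = KEYWORDS[preset] | {kw.casefold() for kw in extra}
    # Case-fold and collapse whitespace so "Click   HERE" matches "click here".
    normalized = re.sub(r"\s+", " ", text.casefold())
    hits = []
    for kw in sorted(keywords):
        # Lookarounds keep "discount" from matching inside "discounted".
        if re.search(rf"(?<!\w){re.escape(kw)}(?!\w)", normalized):
            hits.append(kw)
    return hits
```

For example, `keyword_hits("Click HERE for free   money!")` finds both `click here` and `free money`, while a custom term can be supplied through `extra`.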

Configuration ​

Strictness Levels ​

Strict:

  • Low tolerance
  • High false positives
  • Maximum protection

Medium (recommended):

  • Balanced approach
  • Fewer false positives
  • Good protection

Lenient:

  • High tolerance
  • Minimal false positives
  • Light moderation

Confidence Thresholds ​

Set the minimum score that triggers an action (0.0 to 1.0):

  • 0.8 to 1.0: Very confident spam
  • 0.5 to below 0.8: Likely spam
  • Below 0.5: Probably legitimate

Auto-actions:

  • Score ≥ 0.8: Auto-block
  • Score ≥ 0.6: Auto-hide, require review
  • Score < 0.6: Allow, flag for review
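
The auto-action rules above can be sketched as a simple threshold check. The threshold values come from the list; the `Action` enum, the constant names, and `action_for_score` are hypothetical names used for illustration, not part of the product.

```python
from enum import Enum


class Action(Enum):
    BLOCK = "auto-block"
    HIDE = "auto-hide, require review"
    ALLOW = "allow, flag for review"


# Values taken from the auto-action list; the names are assumptions.
BLOCK_THRESHOLD = 0.8
HIDE_THRESHOLD = 0.6


def action_for_score(score: float) -> Action:
    """Map a spam confidence score in [0.0, 1.0] to an automated action."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    # Check the highest threshold first so 0.8 and above never falls through to HIDE.
    if score >= BLOCK_THRESHOLD:
        return Action.BLOCK
    if score >= HIDE_THRESHOLD:
        return Action.HIDE
    return Action.ALLOW
```

Both comparisons are inclusive, so a score of exactly 0.8 is blocked and a score of exactly 0.6 is hidden.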

Automated Actions ​

Auto-Block ​

Immediately block high-confidence spam:

  • Content not posted
  • User notified
  • Admin log created

Auto-Hide ​

Hide suspicious content pending review:

  • Content hidden from public
  • Visible to admins
  • Can be approved/rejected

Require Admin Approval ​

All flagged content requires manual review.

Trust Score Integration ​

Spam detection affects user Trust Scores:

  • Confirmed spam: -10 points
  • False positive (admin approved): +5 points
  • Low trust users: increased scrutiny
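
A minimal sketch of the point adjustments above, assuming Trust Scores are plain integers. The function and constant names are hypothetical, and the cutoff for "low trust" is left as a required argument because this page does not state one.

```python
# Point changes from the list above; the names are assumptions.
CONFIRMED_SPAM_DELTA = -10
FALSE_POSITIVE_DELTA = 5


def apply_review_outcome(trust_score: int, outcome: str) -> int:
    """Return the new Trust Score after an admin reviews flagged content."""
    deltas = {
        "confirmed_spam": CONFIRMED_SPAM_DELTA,
        "false_positive": FALSE_POSITIVE_DELTA,
    }
    if outcome not in deltas:
        raise ValueError(f"unknown review outcome: {outcome!r}")
    return trust_score + deltas[outcome]


def needs_increased_scrutiny(trust_score: int, low_trust_cutoff: int) -> bool:
    """True when the score is below the caller-supplied low-trust cutoff."""
    return trust_score < low_trust_cutoff
```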

Reviewing Flagged Content ​

  1. Admin → Spam Queue
  2. Review flagged items
  3. Options:
    • Approve (not spam)
    • Confirm spam (delete)
    • Ban user (severe cases)

Performance Monitoring ​

Track spam detection:

  • False positive rate
  • Total spam blocked
  • User impact
  • API usage and costs
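
One way to track the false positive rate is to compute it from spam queue review outcomes. The sketch below assumes the rate means the share of reviewed flagged items that an admin approved as not spam; this page does not define the metric, and the `ReviewStats` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReviewStats:
    confirmed_spam: int  # flagged items an admin confirmed as spam
    approved: int        # flagged items an admin approved as not spam

    @property
    def reviewed(self) -> int:
        return self.confirmed_spam + self.approved

    @property
    def false_positive_rate(self) -> float:
        """Share of reviewed flagged items that were not spam (assumed definition)."""
        if self.reviewed == 0:
            return 0.0  # nothing reviewed yet, so report zero rather than divide by zero
        return self.approved / self.reviewed
```

A rising rate suggests the thresholds or strictness level are too aggressive for the community.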

Best Practices ​

  • Start with medium strictness
  • Monitor false positives
  • Adjust thresholds based on results
  • Combine with Trust Score system
  • Review spam queue regularly

© 2024 - Corbital Technologies. All rights reserved.