AI Spam Detection
Automatically detect and filter spam content using AI-powered analysis.
Overview
AI spam detection analyzes comments and feature requests to identify spam, toxic content, and policy violations.
Requirements
AI-powered detection requires an API key from a supported provider (OpenAI, Anthropic, Gemini, Azure, Perspective, or Tisane). Local keyword-based detection is also available and needs no API key.
Setup
- Settings → Spam Settings
- Select spam detection provider
- Enter API key
- Configure thresholds
- Enable detection
Detection Methods
AI-Powered Detection
Detected content:
- Spam and promotional content
- Hate speech
- Harassment
- Violence
- Sexual content
- Profanity
- Toxicity
Providers:
- OpenAI Moderation API
- Anthropic Claude
- Google Gemini
- Azure Content Safety
- Google Perspective API
- Tisane AI
Local Keyword Detection
Fallback option without API keys:
- Custom keyword lists
- Balanced or strict presets
- Multi-language support
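As a rough illustration of how keyword-based detection with presets might work, here is a minimal sketch. The keyword list, the function names, and the meaning given to the "balanced" and "strict" presets (the number of distinct keyword hits needed to flag content) are all assumptions made for this example, not the product's actual behavior.

```python
import re

# Hypothetical keyword list; entries must be lowercase with single spaces.
SPAM_KEYWORDS = {"free money", "click here", "buy now"}

# Assumed preset semantics: distinct keyword hits required to flag content.
PRESET_MIN_HITS = {"balanced": 2, "strict": 1}


def count_keyword_hits(text: str, keywords: set[str] = SPAM_KEYWORDS) -> int:
    """Count distinct keywords that appear in the text as whole words."""
    # Lowercase and strip punctuation; \w is Unicode-aware in Python 3,
    # so this also tokenizes non-English text.
    normalized = " ".join(re.findall(r"\w+", text.lower()))
    padded = f" {normalized} "
    return sum(1 for keyword in keywords if f" {keyword} " in padded)


def is_spam(text: str, preset: str = "balanced") -> bool:
    """Flag the text when it reaches the preset's hit count."""
    return count_keyword_hits(text) >= PRESET_MIN_HITS[preset]
```

Under these assumptions, a comment containing only one listed phrase passes the balanced preset but is flagged by the strict one.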
Configuration
Strictness Levels
Strict:
- Low tolerance
- High false positives
- Maximum protection
Medium (recommended):
- Balanced approach
- Fewer false positives
- Good protection
Lenient:
- High tolerance
- Minimal false positives
- Light moderation
Confidence Thresholds
Set minimum score for action (0.0 - 1.0):
- 0.8-1.0: Very confident spam
- 0.5-0.7: Likely spam
- 0.0-0.4: Probably legitimate
Auto-actions:
- Score ≥ 0.8: Auto-block
- Score ≥ 0.6: Auto-hide, require review
- Score < 0.6: Allow, flag for review
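The auto-action rules above can be sketched as a small lookup. The threshold values come from the list above. The function name and the action labels are hypothetical and chosen only for this example.

```python
AUTO_BLOCK_THRESHOLD = 0.8  # from the auto-action rules above
AUTO_HIDE_THRESHOLD = 0.6   # from the auto-action rules above


def action_for_score(score: float) -> str:
    """Map a spam confidence score in [0.0, 1.0] to an action label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    if score >= AUTO_BLOCK_THRESHOLD:
        return "auto_block"
    if score >= AUTO_HIDE_THRESHOLD:
        return "auto_hide"
    # Below the hide threshold the content is allowed but still flagged.
    return "allow_and_flag"
```

Checking the higher threshold first makes sure a score of 0.8 or above is blocked rather than merely hidden.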
Automated Actions
Auto-Block
Immediately block high-confidence spam:
- Content not posted
- User notified
- Admin log created
Auto-Hide
Hide suspicious content pending review:
- Content hidden from public
- Visible to admins
- Can be approved/rejected
Require Admin Approval
All flagged content requires manual review.
Trust Score Integration
Spam detection affects user Trust Scores:
- Confirmed spam: -10 points
- False positive (admin approved): +5 points
- Low trust users: increased scrutiny
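The point adjustments above amount to simple arithmetic on a user's score. The sketch below assumes integer Trust Scores and uses hypothetical outcome labels. Any minimum or maximum score is not documented here, so no clamping is applied.

```python
# Point changes taken from the list above.
TRUST_DELTAS = {
    "confirmed_spam": -10,  # admin confirmed the content as spam
    "false_positive": 5,    # admin approved content that was flagged
}


def apply_review_outcome(trust_score: int, outcome: str) -> int:
    """Return the Trust Score after a spam review outcome."""
    if outcome not in TRUST_DELTAS:
        raise ValueError(f"unknown outcome: {outcome!r}")
    return trust_score + TRUST_DELTAS[outcome]
```

For example, a user at 50 points would drop to 40 after one confirmed spam item and return to 45 after a later false positive.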
Reviewing Flagged Content
- Admin → Spam Queue
- Review flagged items
- Options:
  - Approve (not spam)
  - Confirm spam (delete)
  - Ban user (severe cases)
Performance Monitoring
Track spam detection:
- False positive rate
- Total spam blocked
- User impact
- API usage and costs
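One way to compute the first metric from spam queue decisions is sketched below. It assumes the false positive rate is measured as the share of reviewed flagged items that an admin approved as not spam (strictly speaking, a false discovery rate). The function and parameter names are hypothetical.

```python
def false_positive_share(confirmed_spam: int, approved_not_spam: int) -> float:
    """Share of reviewed flagged items that turned out not to be spam."""
    if confirmed_spam < 0 or approved_not_spam < 0:
        raise ValueError("counts must be non-negative")
    reviewed = confirmed_spam + approved_not_spam
    if reviewed == 0:
        # Nothing reviewed yet, so report zero rather than dividing by zero.
        return 0.0
    return approved_not_spam / reviewed
```

With 90 confirmed spam items and 10 approvals, the share is 0.1. A rising value suggests the thresholds or strictness level are too aggressive.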
Best Practices
- Start with medium strictness
- Monitor false positives
- Adjust thresholds based on results
- Combine with Trust Score system
- Review spam queue regularly
Next Steps
- Content Moderation - Manual moderation
- Trust Score System - User reputation