Security Policy

Open-Source AI Security Research Project

This repository contains TRYLOCK v2.0, an open-source research project focused on improving Large Language Model (LLM) resistance to adversarial prompt-based attacks. It is a defensive security project built for education and research.

Project Purpose

TRYLOCK provides:

  • ✅ Research on adversarial LLM defense mechanisms
  • ✅ Open-source training pipeline for attack-resistant models
  • ✅ Public datasets for security research
  • ✅ Defensive AI security tools and methodologies
  • ✅ Enterprise-grade LLM protection strategies

Important: This is defensive security research intended to improve AI safety and security, not to facilitate attacks.

Defensive Research Context

Research Ethics

Authorized Research:

  • Academic and industry security research
  • Defensive AI security development
  • Model robustness improvement
  • Adversarial training methodologies
  • Security benchmarking and evaluation

Responsible Disclosure:

  • Vulnerabilities disclosed through proper channels
  • Coordination with affected vendors
  • Ethical testing in authorized environments
  • Focus on defensive capabilities

Attack Dataset and Methodology

Training Data:

  • Public adversarial attack datasets
  • Ethically sourced attack examples
  • Multi-turn conversation scenarios
  • Documented attack taxonomies
  • Research-focused demonstrations

Data Privacy:

  • No real user data in training sets
  • Synthetic attack examples
  • Privacy-preserving research methods
  • Anonymized benchmarks

Reporting Security Issues

Research Project Vulnerabilities

For security issues with the TRYLOCK codebase, training pipeline, or models:

Email: scott@perfecxion.ai

Report issues such as:

  • Code vulnerabilities in training pipeline
  • Model security bypass techniques
  • Dataset privacy concerns
  • Deployment security issues
  • Evaluation framework vulnerabilities

Response Timeline:

  • Initial Response: Within 48 hours
  • Assessment: Within 7 days
  • Resolution: Based on severity and research impact

Responsible Disclosure

Novel Attack Discovery: If you discover novel attack vectors against TRYLOCK defenses or LLMs in general:

  1. Document the attack methodology and effectiveness
  2. Test in authorized environments only
  3. Report to scott@perfecxion.ai with technical details
  4. Coordinate on disclosure timeline (typically 90 days)
  5. Collaborate on potential defenses and mitigations

Recognition: Security researchers who responsibly disclose vulnerabilities will be credited in project documentation and research papers (with permission).

Security Considerations

Defensive Research Tool

TRYLOCK is designed for defense:

  • Improves model resistance to attacks
  • Reduces attack success rates from roughly 40% to 10-15% in published evaluations
  • Maintains usability for legitimate queries
  • Provides enterprise-grade protection

Not an Offensive Tool:

  • Does not teach new attack techniques
  • Focuses on known attack patterns
  • Defensive capabilities only
  • Research and educational purposes

Training Data Security

Dataset Privacy:

  • No personally identifiable information (PII)
  • No real user conversations
  • Synthetic attack scenarios
  • Public threat intelligence only
  • Privacy-preserving research methods

Data Handling:

  • .gitignore excludes data/tier1_open/, data/tier2_adversarial/, and data/tier3_proprietary/ to prevent accidental commits of private data
  • Public sample data provided for demonstration
  • Full training data available on request for research purposes

Model Security

Trained Models:

  • Available on the HuggingFace Hub (a loading sketch follows this list)
  • Apache 2.0 licensed for research and commercial use
  • Defensive security models only
  • Comprehensive evaluation results published
  • Transparent methodology and limitations
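
Loading a published checkpoint would typically follow the standard transformers pattern. A minimal sketch, assuming a Hub-hosted causal LM; the repo id below is a placeholder, not an actual published model name:

```python
# Hypothetical loading sketch; the repo id is a placeholder,
# not a published TRYLOCK model name.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "scthornton/trylock-model"  # placeholder; check the project for real names
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
```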

Security Limitations:

  • No defense is perfect (10-15% of attacks still succeed)
  • Evolving threat landscape requires updates
  • Trade-offs between security and usability
  • False positives minimized but not eliminated

Supported Versions

Version   Supported
2.0.x     ✅
1.x       ❌ (legacy)

Note: TRYLOCK v2.0 represents a complete architecture redesign with three-layer defense stack.

Best Practices

For Security Researchers

1. Responsible Testing

  • Test only in authorized environments
  • Use isolated systems for adversarial research
  • Don't attack production systems
  • Follow ethical hacking guidelines
  • Coordinate disclosure of findings

2. Research Ethics

  • Focus on defensive capabilities
  • Share findings with security community
  • Contribute to open-source defenses
  • Respect privacy and data protection
  • Follow academic integrity standards

3. Dataset Usage

  • Use public datasets responsibly
  • Don't misuse attack examples
  • Respect data licensing terms
  • Cite sources appropriately
  • Maintain privacy standards

4. Model Deployment

  • Evaluate security trade-offs (alpha tuning: 0.0-2.5)
  • Monitor false positive rates
  • Test with legitimate use cases
  • Document security configuration
  • Maintain update strategy

For Enterprise Deployers

1. Security Configuration

  • Alpha = 0.0: Research mode (minimal filtering)
  • Alpha = 1.0: Balanced mode (85-90% protection, <5% false positives)
  • Alpha = 2.5: Lockdown mode (95%+ protection, higher false-positive rate; see the sketch below)
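
As an illustration of this trade-off (not the project's actual API; DefenseConfig and the preset names are hypothetical), the three modes could be expressed as presets over a single alpha parameter:

```python
# Illustrative only: TRYLOCK's real configuration surface may differ.
# Alpha scales defense strength from 0.0 (research) to 2.5 (lockdown).
from dataclasses import dataclass

@dataclass(frozen=True)
class DefenseConfig:
    alpha: float = 1.0  # balanced mode by default

    def __post_init__(self) -> None:
        if not 0.0 <= self.alpha <= 2.5:
            raise ValueError("alpha must be within [0.0, 2.5]")

RESEARCH = DefenseConfig(alpha=0.0)  # minimal filtering
BALANCED = DefenseConfig(alpha=1.0)  # ~85-90% protection, <5% false positives
LOCKDOWN = DefenseConfig(alpha=2.5)  # 95%+ protection, more false positives
```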

2. Deployment Security

  • Use Docker containers for isolation
  • Implement monitoring and logging (see the audit-log sketch after this list)
  • Regular model updates for new threats
  • Incident response procedures
  • Security audit trails
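
One way to satisfy the monitoring and audit-trail items, sketched with only the Python standard library; the logger name and record fields are illustrative, not TRYLOCK APIs:

```python
# Hypothetical audit-trail sketch: one structured record per flagged request.
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("trylock.audit")

def log_flagged_request(request_id: str, reason: str, alpha: float) -> None:
    """Emit one structured audit record per flagged request."""
    audit_log.info(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "request_id": request_id,
        "reason": reason,
        "alpha": alpha,
        "action": "flagged",
    }))

log_flagged_request("req-123", "multi-turn jailbreak pattern", alpha=1.0)
```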

3. Integration Best Practices

  • Test with representative workloads
  • Measure false positive rates (a measurement sketch follows this list)
  • Balance security vs. usability
  • Train users on limitations
  • Monitor for adversarial attempts
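
Measuring false positives can be as simple as counting refusals over a curated benign prompt set. A minimal sketch, where is_blocked stands in for whatever flag your deployment exposes (hypothetical, not a TRYLOCK API):

```python
# Hypothetical sketch: estimate the false-positive rate on benign prompts.
def false_positive_rate(benign_prompts, is_blocked) -> float:
    """Fraction of legitimate prompts the defense wrongly refuses."""
    if not benign_prompts:
        return 0.0
    blocked = sum(1 for p in benign_prompts if is_blocked(p))
    return blocked / len(benign_prompts)

benign = ["Summarize this contract.", "Write a haiku about spring."]
print(false_positive_rate(benign, is_blocked=lambda p: False))  # 0.0
```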

4. Risk Management

  • No defense is perfect (10-15% residual risk)
  • Layer with other security controls
  • Regular security assessments
  • Update threat models
  • Incident response planning

For Model Users

1. Understanding Limitations

  • TRYLOCK substantially reduces attack success rates (from ~40% to ~10-15%), not to zero
  • Some legitimate queries may be flagged
  • Evolving attacks may bypass defenses
  • Regular updates recommended
  • Defense-in-depth approach needed

2. Reporting Issues

  • Report false positives to improve model
  • Share novel attack patterns discovered
  • Provide feedback on usability
  • Suggest improvements
  • Collaborate on research

Research Transparency

Open Science

Public Resources:

  • Training code on GitHub
  • Models on HuggingFace Hub
  • Public demonstration dataset
  • Evaluation methodology
  • Research paper and documentation

Reproducibility:

  • Complete training pipeline
  • Evaluation scripts and benchmarks
  • Hyperparameter configurations
  • Random seeds for reproducibility (see the seed-pinning sketch after this list)
  • Docker environments for consistency
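
Seed pinning is the usual first step toward reproducible runs. A minimal sketch, assuming the PyTorch-based pipeline referenced elsewhere in this project:

```python
# Minimal seed-pinning sketch for reproducible training runs.
import os
import random

import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    """Pin the common sources of randomness for reproducible runs."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # safe no-op without CUDA

set_seed(42)
```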

Academic Collaboration

Research Partnerships:

  • Open to academic collaborations
  • Industry research partnerships
  • Security community contributions
  • Peer review and validation
  • Co-authorship on derivative work

Citation: If you use TRYLOCK in your research, please cite appropriately and reference this repository.

Compliance and Ethics

Ethical AI Research

Principles:

  • Defensive research focus
  • Responsible disclosure practices
  • Privacy-preserving methods
  • Transparent methodology
  • Open-source contribution

Standards:

  • Follow NIST AI Risk Management Framework
  • Adhere to IEEE AI ethics guidelines
  • Respect privacy regulations (GDPR, CCPA)
  • Academic research integrity
  • Responsible AI development

License Compliance

Apache 2.0 License:

  • Commercial use permitted
  • Modification allowed
  • Distribution permitted
  • Patent grant included
  • Attribution required

See LICENSE for full terms.

Resources

AI Security Research

Frameworks and Standards:

  • OWASP LLM Top 10 vulnerabilities
  • NIST AI Risk Management Framework
  • MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems)
  • AI Incident Database (AIID)

Research Community:

  • ML Security workshops (NeurIPS, ICML)
  • Adversarial ML research groups
  • AI safety organizations
  • Security conference tracks

Technical Documentation

Project Resources:

  • Training and evaluation guides
  • Deployment documentation
  • API integration patterns
  • Benchmark results and analysis
  • Research paper and methodology

External Resources:

  • HuggingFace documentation
  • PyTorch training guides
  • Adversarial ML literature
  • LLM security research

Contact

For questions about TRYLOCK research, security concerns, or collaboration opportunities, contact scott@perfecxion.ai.


Reminder: TRYLOCK is defensive security research. Use responsibly, test ethically, and contribute to improving AI security for everyone.
