
Controlling AI with AI. Conative Gating introduces a second model trained with inverted incentives: it is rewarded for blocking, suspicious by default, and adversarial to the LLM’s proposals. The design draws on metaphors from human constraint.
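As a rough illustration of the gating idea described above, here is a minimal sketch. It is not taken from this repository: the names `gate`, `Verdict`, `toy_suspicion`, and the `threshold` parameter are all hypothetical, and the keyword heuristic merely stands in for the trained second model. The only point it tries to show is the "suspicious by default" posture, where a proposal is blocked unless the adversarial scorer reports low suspicion.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Verdict:
    allowed: bool
    suspicion: float


def gate(
    proposal: str,
    suspicion_of: Callable[[str], float],
    threshold: float = 0.2,  # hypothetical cutoff, not a value from the project
) -> Verdict:
    """Allow a proposal only when the adversarial scorer is clearly unsuspicious."""
    score = suspicion_of(proposal)
    # Suspicious by default: NaN or out-of-range scores are treated as a block.
    valid = 0.0 <= score <= 1.0
    return Verdict(allowed=valid and score < threshold, suspicion=score)


def toy_suspicion(proposal: str) -> float:
    """Stand-in for the second model: a keyword heuristic, not a trained model."""
    risky = ("rm -rf", "sudo", "curl")
    return 1.0 if any(token in proposal for token in risky) else 0.1


if __name__ == "__main__":
    for text in ("print('hello')", "sudo rm -rf /"):
        print(text, "->", gate(text, toy_suspicion))
```

In this sketch the gate never tries to repair or rank proposals; it only answers allow or block, and any scorer failure falls on the blocking side.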


hyperpolymath/conative-gating



Contributors 3
