
Shaw (wartime arc) | Feb 24, 2025 05:36
The solution to AI alignment is transparency
But that means having hard conversations with complex moral answers
The models are biased by the information they were trained on, which is heavily saturated with media coverage that tends to associate words like “disinformation” with people who have been attacked a lot in that same media. The only way to make the models consistently state one position or another is to either RLHF the shit out of them, or to have a real and forthcoming conversation about what is valued in any given community and put that in the prompt.
We could “balance” the dataset, but what would that really mean? It reduces to deleting content from the model’s memory, which makes it less general.
So prompting is how we express our biases and values. And we should do it openly.
If that’s dictator shit then, well, you reap what you sow in this world, but at least it’s a collective understanding and not some clandestine manipulation of the algorithm.
Like a social smart contract.
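To make that concrete, here is a minimal TypeScript sketch of what “values in the prompt, in the open” could look like: the community’s stated values live in a public, versioned document, and every request injects them verbatim into the system prompt. Everything here (`CommunityValues`, `buildSystemPrompt`, the governance URL) is hypothetical illustration, not any particular framework’s API.

```ts
// A community's values as a public, versioned artifact.
// In practice this would be fetched from a public repo or an on-chain
// record, so anyone can audit exactly what the model was told to value.
interface CommunityValues {
  version: string;      // bumped on every ratified change
  ratifiedBy: string;   // link to the vote or discussion that approved it
  principles: string[]; // the plain-language values themselves
}

const values: CommunityValues = {
  version: "1.2.0",
  ratifiedBy: "https://example.org/governance/vote-42", // hypothetical URL
  principles: [
    "State uncertainty plainly instead of asserting a contested position.",
    "Attribute claims about people to their sources.",
    "Prefer primary sources over media characterizations.",
  ],
};

// Assemble the system prompt directly from the ratified document,
// citing its version so outputs can be traced to a specific agreement.
function buildSystemPrompt(v: CommunityValues): string {
  const list = v.principles.map((p, i) => `${i + 1}. ${p}`).join("\n");
  return [
    `You operate under community values v${v.version} (${v.ratifiedBy}).`,
    `Follow these principles:`,
    list,
  ].join("\n");
}

console.log(buildSystemPrompt(values));
```

The point of the sketch is the audit trail: because the values are versioned and the prompt is assembled from them verbatim, anyone can diff what changed and when. That is what makes it a contract rather than clandestine tuning.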