Note: This document is preliminary. Some of the concepts refer to my first-stage design rather than to current work on adapting SPKI or KeyNote (RFC 2704).
Public key cryptography gave us the digital signature. A digital signature allows you to receive a document from any channel, and be sure that the document was generated by a person who knew a special secret key. That secret key has a corresponding public version. If you can find out in some sure way what a particular person's public key is -- for example, you meet them and they tell you -- then you can later check to be sure a document that says it is from them is from them.
Of course you can't meet everybody and confirm their identity. And their "identity" may have little meaning anyway. I may be able to prove I am "Brad Templeton" but what does that mean?
All the answers come in a special use of digital signature called a certificate.
Certificates allow you to deal through intermediaries in a chain of trust to get some information you can trust from just about anybody in the world.
It starts with somebody you do trust. Perhaps it's somebody famous or a big company. They published their public key in many ways -- it's on their web site, they read it on TV, they mailed it to you, it appears every week in a newsgroup. Enough ways that you are sure that nobody was able to compromise them all. This person can be called a "root signer."
A certificate is a document that is signed by somebody you know that tells you something about somebody you don't know. In particular, it tells you their public key, and something else about them you want to know.
Let's say for example that I am the moderator of rec.humor.funny. You don't know me, but we both know the root signer described above. I meet her, and tell her my public key, and she makes sure of my name and that I really am the moderator of rec.humor.funny.
Then she makes up a document that says:
"Brad Templeton (brad@clari.net) who has the public key XXXX, is the moderator of rec.humor.funny."
(Of course this is said in a computer-formal way, not in English.)
And she signs this document using her secret key. Then I send a message to you. I sign that message (with my secret key) and I include with it a copy of the certificate. First you read the certificate, and you check it out. You check that it was really signed by the root signer -- you can do that because you, and everybody else, have a copy of her public key.
You don't have to fetch my key or the certificate from anywhere; a copy is included in the message.
Once you do that you now can be sure of my public key, or at least as sure as the root signer was. And you can also be sure the root signer thought I was the moderator of rec.humor.funny.
And now you can look at the rest of the message. You can check my signature on it now, using my public key, though you never met me and don't know me from Adam. And you know for sure the message came from a person named Brad Templeton and that it came from a person the root signer said was the moderator. The first may actually be boring, but the latter can allow you to let the message be acted on as one from the moderator, i.e. you might let it go into that newsgroup, when otherwise you would not.
As you'll read later, the two facts do not need to be connected. One might have a key certified as moderator of the newsgroup without saying who the keyholder is. That's because it really doesn't matter to you who the keyholder is, simply that they have been trusted, via a chain, to moderate the group.
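The two checks described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: it uses textbook RSA with tiny fixed primes and no padding, which is not secure, and the names toy_keypair, cert_body and accept are made up for this sketch and are not part of any spec.

```python
import hashlib

def toy_keypair(p, q, e=65537):
    """Textbook RSA from two primes. Toy only: tiny keys, no padding."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)
    return (n, e), (n, d)              # (public key, secret key)

def _digest(data, n):
    # Hash the data and reduce it so it fits under the toy modulus.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(secret_key, data):
    n, d = secret_key
    return pow(_digest(data, n), d, n)

def verify(public_key, data, signature):
    n, e = public_key
    return pow(signature, e, n) == _digest(data, n)

def cert_body(name, key, role):
    # The "computer-formal" statement that the root signer signs (assumed layout).
    return f"{name}|{key[0]}|{key[1]}|{role}".encode()

# Key pairs built from well-known Mersenne primes, purely for illustration.
root_pub, root_priv = toy_keypair(2**61 - 1, 2**89 - 1)
brad_pub, brad_priv = toy_keypair(2**31 - 1, 2**107 - 1)

# The root signer certifies my key and my role.
cert = {"name": "brad@clari.net", "key": brad_pub,
        "role": "moderator of rec.humor.funny"}
cert_sig = sign(root_priv, cert_body(**cert))

# I sign my article and ship the certificate along with it.
article = b"An approved joke for rec.humor.funny"
article_sig = sign(brad_priv, article)

def accept(article, article_sig, cert, cert_sig, trusted_root):
    """Return the certified role if both signatures check out, else None."""
    # Step 1: was the certificate really signed by the root signer?
    if not verify(trusted_root, cert_body(**cert), cert_sig):
        return None
    # Step 2: was the article signed by the key named in the certificate?
    if not verify(cert["key"], article, article_sig):
        return None
    return cert["role"]
```

The receiver only needs root_pub stored locally; everything else arrives with the message. Changing either the article or any field of the certificate makes accept return None.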
You don't have to record who the moderator is. You only have to know that the group is moderated. The certificate does the rest.
My certificate might give me more power. It might say I have the right to do a newgroup or rmgroup of any group starting with "rec.humor.funny.*". It might say I can cancel any message in the group. It might say I can cancel any message from the site "*.clari.net" because the root signer knew I was an admin at that site.
And it might say that I have the power to issue other certificates granting a subset of my own powers.
The root signer can't meet everybody and confirm everything. But she can delegate that power, by signing a certificate saying, "This person is allowed to issue certificates" or "This person can issue certificates to do rmgroup in the comp.* hierarchy."
So we can have a chain of certificates and in fact you can confirm just about anything this way. The further you get from the root signer that you trust, the more risk you have that the line of trust can be broken, but exponentiation being what it is, you don't need many levels to get something that can verify just about anything.
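To make the "subset of my own powers" rule concrete, here is a rough sketch of how a site might walk a chain and check that nobody handed out more than they held. It assumes the signatures on every certificate have already been verified, that a power is just an (action, newsgroup pattern) pair, and that patterns only use a trailing "*". The function names are invented for the sketch.

```python
from fnmatch import fnmatchcase

def covers(parent, child):
    """True if power `parent` is at least as broad as power `child`.

    Assumes patterns only use a trailing "*", so matching the child's
    pattern text against the parent's pattern is a fair subset test.
    """
    p_action, p_pattern = parent
    c_action, c_pattern = child
    return p_action == c_action and fnmatchcase(c_pattern, p_pattern)

def chain_allows(chain, action, group):
    """chain: list of power lists, ordered from the certificate issued by
    the root signer down to the poster's own certificate."""
    # Every certificate may only grant powers its issuer already held.
    for issuer_powers, subject_powers in zip(chain, chain[1:]):
        for child in subject_powers:
            if not any(covers(parent, child) for parent in issuer_powers):
                return False
    # The last certificate must actually grant the requested action.
    return any(a == action and fnmatchcase(group, pattern)
               for a, pattern in chain[-1])

chain = [
    [("rmgroup", "comp.*"), ("newgroup", "comp.*")],  # granted by the root
    [("rmgroup", "comp.lang.*")],                     # delegated subset
]
print(chain_allows(chain, "rmgroup", "comp.lang.c"))
print(chain_allows(chain, "rmgroup", "comp.os.misc"))
```

The first call is allowed because comp.lang.c falls under the delegated pattern; the second is refused because the last certificate in the chain never received power over comp.os.*.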
So in theory you can have a document that goes like this:
For a distributed network like USENET, having multiple certificates -- or even just one -- can be too bulky. So while a document can look like this, it's a lot easier if Alice collects up all these certificates, and mails it to a machine run by the Root signer. The root signer sends back to Alice a modified version of her whole document. All the sigs and certificates are gone. All that is left is a new line in the document saying,
"I the root signer followed a certificate chain and confirm this document came from Alice"
(To be more compact, it says, "I confirm this came from the e-mail address in the From: line.")
And the rest of Alice's document -- the one that Alice signed. But now there is just one signature on the whole thing, and it only costs about 70 bytes. You only have to check one signature, and it is done with a public key you have stored.
Chances are actually that the root signer machine doesn't even mail the post back to Alice except as a courtesy. It can just post it directly to USENET.
The certificate collapser can do a lot more, too. It can check for any problems in Alice's message and tell her about them, generate a message-id, control her posting volume and a variety of other things.
There doesn't have to be one root signer, and in fact usually there would not be. That's both because you want redundancy -- in case one root signer is 'down' -- and to avoid centralizing of control.
If you have more than one root signer, everybody has to keep on hand the public keys of all the root signers. They might be on web sites or broadcast on the net. For example, a new root signer might be heralded by a message signed by half or more of the other root signers saying, "Lo, here is Dave. His key is vvvvv and he is a root signer."
Thus anything, including anointing, can be done by a vote and the vote is checked not by a returning officer, but by everybody. The list of root signers, signed by all the root signers, can be re-issued regularly as well as being available on HTTP, FTP and other servers.
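The vote check that everybody runs could look something like the sketch below. The signature check is passed in as a function so the counting rule stands on its own; stub_verify is a placeholder with no cryptography in it, and all the names and data layouts are assumptions for the sake of the example.

```python
def accept_new_root(announcement, endorsements, known_roots, verify):
    """Accept an announcement if half or more of the known root signers
    have validly signed it.

    endorsements: {root_name: signature}
    known_roots:  {root_name: public_key} already stored on this machine
    verify:       function(public_key, message, signature) -> bool
    """
    valid = sum(
        1
        for name, signature in endorsements.items()
        if name in known_roots
        and verify(known_roots[name], announcement, signature)
    )
    return 2 * valid >= len(known_roots)

def stub_verify(public_key, message, signature):
    # Placeholder only: a real system would check a digital signature here.
    return signature == f"signed:{public_key}:{message}"

known_roots = {"alice": "A", "bob": "B", "carol": "C", "erin": "E"}
announcement = "Lo, here is Dave. His key is vvvvv and he is a root signer."

endorsements = {
    "alice": f"signed:A:{announcement}",
    "bob": f"signed:B:{announcement}",
    "mallory": "signed:M:whatever",   # not a known root signer, so ignored
}
print(accept_new_root(announcement, endorsements, known_roots, stub_verify))
```

Signatures from names that are not already in the local database count for nothing, which is what keeps an outsider from voting himself in.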
Your computer might scan for such messages, and if it sees them, put Dave's key in the database. So it's not a lot of work to maintain keys for perhaps 20 or so root signers. In fact, even maintaining a few hundred is reasonable, so that sub-root signers are possible. (You want that because in fact signing machines are an attack target, so you don't want them to have a top level root key.)
A good key system has keys that expire, even the root signers' and even the most ordinary user's. You have to get a new key every month or every year. But that can be automated. Your system can send in a message, signed with your old key, saying, "I'm Fred. Here is my new key." to your certifier. Your certifier can then give you a new certificate, good for another month or year. If however, there is a police report out about how somebody is posing as Fred and stealing his key, the certifier will not issue this. Instead, they'll take extra steps to confirm that the real Fred is asking for a certificate, just as they did the first time.
In addition, the owner of a key may want the power to issue a message saying, "My key is no good anymore" or they may want to leave that with a certifier. Sometimes if your key is stolen you may find being able to do nothing is worse than the risk of being impersonated.
The most basic USENET certificate would be one saying, "Public key XXX can post as brad@clari.net" and in fact that is what most people will have. But the certificate may have other attributes in it.
To make all this work a simple certification language is needed, which specifies powers, inheritable powers and rules of inheritance, as well as newsgroup patterns over which the powers apply.
Certificates would be written in this language. The most basic terms would be ones like "e=email-address" to specify the key belongs to an email address and the user is a valid poster. That certificate probably was issued by a challenge response. The user's software picked a key. It mailed the key to a certifier. The certifier mailed back to the From: address and said, "here's your certificate." If E-mail is working right, only the true reader of an E-mail box could get a certificate for it.
(If their mailbox security is broken, then anybody who can read it could get a fake certificate, but that's life. Evidence is left. Ditto if a mailing list address is used, which is why you may want a 4-step instead of a 2-step process for this, but read the technical specs for details on all of this.)
Higher level certificates -- moderators, newgroup issuers etc. -- would be done with higher security. Phone calls, personal meetings and verified chains of trust.
You could give one root signer all the power. Or you could say that certain certificates only come when signed by a minimum number of signers, so that no one party has the "power." This system is full of accountability, so if somebody abuses power, they can't really do it in secret. These keys are all for signing public messages after all.
A typical header would look like:
Auth: auth-type Key-Name [flags] signature
The auth-type is the type of authentication scheme used. The scheme described here would be type "U" probably.
The Key-Name is a name for the key. This could be a root signer's key stored in a local database, a URL for a key, or a number, to refer to a key signed in a later certificate. After certificate collapse, this is a root signer's key or a special sub-root signer's key, which is cached on your machine or can be fetched in a number of well-known ways.
The flags include:
Which signing algorithm to use is implied from the key.
The signature itself is the base64 encoding of the digital signature of the cryptographically secure hash of the document and most of its headers. The hash is not stored, since you can calculate it yourself; only the signature is. The signature is in the range of 54 bytes.
Roughly:
Auth: U master1 sdfadslfsdfkljflkjljljlkjlj;23423423423423432234
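For what it's worth, pulling such a line apart is simple. The sketch below assumes fields are separated by whitespace, that the signature is always the last field, and that anything between the key name and the signature is a flag; parse_auth is a made-up name, and the flag handling is a guess since the flags are not pinned down here.

```python
def parse_auth(header):
    """Split an Auth: header into its parts (assumed whitespace-separated)."""
    name, _, value = header.partition(":")
    if name.strip().lower() != "auth":
        raise ValueError("not an Auth header")
    fields = value.split()
    if len(fields) < 3:
        raise ValueError("expected auth-type, key name and signature")
    return {
        "auth_type": fields[0],
        "key_name": fields[1],
        "flags": fields[2:-1],    # optional, possibly empty
        "signature": fields[-1],
    }

parsed = parse_auth(
    "Auth: U master1 sdfadslfsdfkljflkjljljlkjlj;23423423423423432234"
)
print(parsed["auth_type"], parsed["key_name"], parsed["flags"])
```

A receiving site would then look up key_name in its local database of root signer keys before checking the signature.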
With certificate collapse the authentication line is around 65 bytes. In theory, after relaying the article to users, a reading system need not keep the Authentication line with the article if it doesn't want to. But usually it will for those users who don't trust the site and want to authenticate themselves.
Certificate: auth-type key-id certifiers-key-name Attributes [flags] certified-key : signature
These are much longer. The key-id is the key name, and may be a plus sign followed by an integer to indicate a key in the same document so tagged by the flags. The certifiers-key-name is the certifier's key name.
The Attributes are an encoding in the certification language described above, of the authorities the user has. The default is "e=%f" meaning the right to use the email address found in the From: line.
The flags are flags to affect this process, possibly hash type etc.
The certified-key is the key being certified. It's long and base-64 encoded.
The signature is in this case calculated by taking a cryptographically secure hash of the entire certificate (except for this signature, of course) up to the end of the flags, but replacing numeric key numbers with key names. This hash result is then signed and encoded in base64.
There may be multiple certificate headers. They are not redundant. (Perhaps they should be Certificate1, Certificate2, ...?)
Example:
Certificate: U +1 kahuna-certifier3 e=%f f-%F x=%d trust/comp..* sldfjsalfjalaskfhalshfalskfhaslfhaslhaslhalsahkslfhaslkhkhjlkhsdfadsfd 123123123123 : 79asd8f7a0s9df87as90df87as90df87a0d9sf87a8as9df78 s0d987fa0s9df87as90df7as90df87as90df7asd90f87ad90f7as9087df90a8787 asdfa7sd0f987sad0f978adf90a78f98df790sdf7a90sdf7asd90f78907889787a asd08f7a90s
Though not explained in this document, the certificate grants the power to use the email address and full name from the From: line, expires on Oct 31 of this year, and the user is a "trusted" poster in the comp hierarchy, meaning they can post without throttle and even do followups that are crossposted.
(Before you worry, the inserting of %f from the From: line is safe, in that if a malicious user uses this certificate in a post from other than that line, the signature on the certificate will not match. The hashing of the certificate is done after substitution of %f, not before. %f is just a way to keep the certificate short when it uses information that's already found in every article.)
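A tiny sketch of why the substitution is safe: the hash is taken over the certificate text after %f has been replaced by the address in the From: line, so the same certificate text hashes differently under a different From: line. SHA-256 and the function name certificate_hash are assumptions for this example, and only %f is handled.

```python
import hashlib

def certificate_hash(cert_text, from_address):
    """Hash the certificate after expanding %f to the From: address."""
    expanded = cert_text.replace("%f", from_address)
    return hashlib.sha256(expanded.encode()).hexdigest()

cert_text = "U +1 kahuna-certifier3 e=%f"

# Hash the certifier signed when issuing the certificate to Brad.
issued = certificate_hash(cert_text, "brad@clari.net")

# Hash a receiver computes when the same certificate shows up in an
# article with somebody else's From: line.
reused = certificate_hash(cert_text, "mallory@example.com")

print(issued == reused)
```

Because the two hashes differ, the certifier's signature over the first one will not verify against the second, and the borrowed certificate is useless.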
A Powers header is used after certificate collapse. The Powers header is part of the new article to be signed without the certificates. It contains a statement by the certificate collapser of the verified powers that were certified in the original message. Generally it will not be all the powers the user has, just the ones needed for this message.
The default, if not present, is "e=%f", the right to use the E-mail address found in the From: line of the message.
Note, this header may not be necessary. It may be better simply to arrange that articles needing special powers do not go through certificate collapse.
If 70 bytes extra per article seems high, another system is possible that absolutely requires central signing and certificate collapse.
Certificate collapsing systems could batch up their work, issuing articles in one minute batches -- perhaps 50 articles in a batch. They could then issue a "digest." This digest would list each message-id, and the hash for the message under that message-id. The digest itself would then be signed.
Receiving sites could receive these digests and store them in the history file or similar if the article has not yet arrived. Articles that arrive before their digest would be put in a pending queue. When both article and digest are present, hash the article and compare with the hash from the digest. If they match, it's a valid article.
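The receiving side of this could be sketched as below. It assumes the digest's own signature has already been checked, that a digest is just a mapping from message-id to a hex hash, and that SHA-256 is the hash; the class and method names are invented for illustration.

```python
import hashlib

class DigestChecker:
    def __init__(self):
        self.expected = {}  # message-id -> hash taken from a verified digest
        self.pending = {}   # message-id -> article that arrived before its digest

    def add_digest(self, entries):
        """Record a verified digest; return results for any waiting articles."""
        results = {}
        for message_id, digest_hash in entries.items():
            self.expected[message_id] = digest_hash
            if message_id in self.pending:
                article = self.pending.pop(message_id)
                results[message_id] = self._matches(message_id, article)
        return results

    def add_article(self, message_id, article):
        """Return True/False if the digest is known, or None if queued."""
        if message_id not in self.expected:
            self.pending[message_id] = article
            return None
        return self._matches(message_id, article)

    def _matches(self, message_id, article):
        return hashlib.sha256(article).hexdigest() == self.expected[message_id]

checker = DigestChecker()
article = b"Subject: test\n\nHello, net."
article_hash = hashlib.sha256(article).hexdigest()

early = checker.add_article("<1@example.com>", article)        # digest not here yet
resolved = checker.add_digest({"<1@example.com>": article_hash,
                               "<2@example.com>": article_hash})
late = checker.add_article("<2@example.com>", b"something else")
print(early, resolved, late)
```

Only one signature check is needed per digest, no matter how many articles it covers; each article costs just one hash.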
This means overhead -- in space, and in CPU -- can be made as small as you like. There are other big wins from batching -- more later.
Q: What if somebody wants to post under a false name?
A: A spec is underway to allow people to specify they are using a false name. Specifying is the polite thing to do; it saves people from writing replies to addresses that aren't there. And be aware that in many newsgroups, a lot of people and sites would just discard such articles. That is their choice. In some groups where anonymity is encouraged, people would not.
Some people want to put a false name in their post and put their real one in English to stop address harvesters. They can use the anonymous convention if they want, but generally this is not the answer to spam. Just as you hate the spammers for using your time and resources to send you junk E-mail, other people hate just as much your posting to the net but being unwilling to let them write back to you the normal way. Want to post to the net? Expect that people can write you back.
Q: Won't this add a lot to articles?
A: If all articles are signed with certificate collapse, about 70 bytes. If bulk signing is used, a negligible amount. If certificate collapse is not used for posting, but for certificate issue, about 260 bytes to each article, again depending on key length. That's a fair chunk, but frankly an authenticated net is worth it. NNTP servers can be modified to not send down authentication headers to clients that don't really need them, so the extra load is not on clients and modem users.
Q: Who will set all the policies?
A: Right now all the systems have all the policies set at each site manually. It's not workable. Some systems propose special control messages to distribute that information. Certificate systems allow the adding of a layer of indirection (or layers) without requiring an online connection. They let you delegate tasks but they don't force you to. And they don't force everybody to delegate to the same person. There could be two or more parallel networks of root signers with their own delegates, each issuing control messages in their own way, and you could program your system to listen to one, both or none of them. The key is you only have to do that one configuration, if you want, and then other people do the rest.
Q: What about offline sites with UUCP feeds?
A: This system lets them have full authentication. Alternately, if you are a leaf node, and you trust your upstream feeding system's security, and it is doing the authentication, then you could take a feed with authentication stripped out, and still benefit from it.
Q: Isn't public key cryptography complex?
A: It is, but there are free libraries to do all the hard stuff. There are simpler solutions to authenticate cancel messages, Path: lines and a few other items, but you need a system like this to verify control messages at the least, in which case you might as well verify everything with it.
Q: What if a root signer goes rogue?
A: The other root signers, in concert, would have the power to issue a signed "key revocation" telling all sites that get it to remove that signer's key from the records. The anointing message would have a short expire time so the rogue root signer could not re-issue it.
Q: What about anonymous remailers?
A: A person with an anonymous address that sends mail back will probably be able to get a certificate. They can post without revealing their name, except presumably to the party running the anonymous remailer. Digital signature works fine to prove the same person sent two messages without saying at all who that person is in the real world.
It's also possible for a party to set up an anonymous certification service, providing certificates to people who can't even be E-mailed, though in the groups that insist that posts be replyable this would not be desired, obviously. Groups that encourage anonymous contribution and don't require replyable addresses could have participants with certificates even though the certifier doesn't know their real identity. The certifier has to know something about them if you want any checks on abuse, but it need not be their real identity.
How much anonymity we will have will depend on the groups. Some groups want none. Some want lots of it. Chances are certificates issued to anonymous posters will only work in a subset of groups.