The Path header shows the route a message took from its entry into the USENET system to the current system. It is a list of site identifiers with the origin on the right.
   PATH-ENTRY     = old_path / new_path
   old_id         = 1*( ALPHA / digit / '-' / '.' / '_' )
   old_path       = old_id *(punctuation old_id)
   punctuation    = LWSP / %x21-2f / %x3a-40 / %x5b-60 / %x7b-7f
   new_delims     = [FWS] ( '@' / '/' / ',' ) [FWS]
   new_path       = post_injection '%' pre_injection
   delim_plus_id  = '!' [FWS] old_id / new_delims site_id
   post_injection = *(site_id 1*punctuation) site_id
   pre_injection  = site_id *delim_plus_id
   site_id        = ALPHA word              ; UUCP name
                  / ALPHA                   ; for "x" tail entry
                  / '.' word                ; other registered name
                  / <FQDN>                  ; as per RFC 1034
                  / <dotted-quad>           ; numeric IP address rep
                                            ; specified in rfc820 etc.
                  / '[' <dotted-quad> ']'
                  / '[' <ipv6-numeric> ']'  ; per RFC1884
   word           = 1*( ALPHA / digit / '-' / '_' )
When a system receives a message from another system, it MUST add its own unique name (path-identity) and a delimiter to the beginning of the Path string. In addition, if needed, folding-whitespace may be added.
The path-identity added MUST be unique over all of USENET. To this end it should be one of:
Whichever form is chosen, a site SHOULD use a form which can be verified using one of the schemes described below by all sites to which it will forward news articles. If all forwarding is by NNTP or other internet based protocols, then the FQDN or IP address encodings are advised. For the purposes of comparison, FQDN entries should be put in an all-lower-case canonical form.
Because RFC1036 specified that any punctuation or whitespace could act as a delimiter, programs SHOULD accept this, with the exception that IPv6 addresses containing colons MUST be treated as a single unit. Modern programs MUST generate only delimiters from the set "!,%@", plus optional additional whitespace.
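The lenient reading rule above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function name split_path and the regular expression are assumptions, and the sketch takes a bracketed literal such as "[2001:db8::1]" to be the form in which an IPv6 address appears.

```python
import re
from typing import List

# Bracketed literals are tried first, so the colons inside an IPv6
# address are never mistaken for delimiters. Everything else that is
# not a letter, digit, '.', '_' or '-' is treated as a delimiter.
_ENTRY = re.compile(r"\[[^\]]*\]|[A-Za-z0-9._-]+")


def split_path(path: str) -> List[str]:
    """Return the entries of a Path string, leftmost (newest) first."""
    return _ENTRY.findall(path)
```

For instance, split_path("b.example, [2001:db8::1]@a.example%news!x") yields five entries, with the bracketed address kept whole.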
When a site receives an article from another site, it SHOULD (MUST after the Verified-Path-Date) verify the identity of the source site. When processing an article from a source, the leftmost entry of the Path line should be extracted, converted to a canonical form, and tested to see if it matches the canonical form of the verified identity of the source. If it does, a "," should be used as the delimiter: the comma and then the receiving site's path-identity MUST be prepended to the Path line. The method of verification is up to the site. Any method of suitable authenticity may be chosen, with the consideration that in the event of problems at the source site, the relaying site may be called upon to reliably identify it. Verification schemes for the most common forms of article transmission are described below.
If the leftmost entry does not match the verified identity of the source, then the receiving site should prepend an "@" delimiter, then a simple form of the verified identity of the source, then a "," delimiter, and then the receiving site's own path-identity. Two identities should not be added in this way if the provided and verified identities match. For articles received from an internet source, the 32-bit IPv4 address or properly verified FQDN, whichever is shorter, is encouraged for the generated ID.
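The two cases (claimed identity matches the verified one, or does not) might be handled as in the following Python sketch. The names canonical, leftmost_entry and prepend_identity are hypothetical, and the canonical form used here (lower-casing any entry that contains a dot) is an assumption based on the rule that FQDN entries are compared in lower case. How the source is actually verified is outside the sketch; verified_source is simply passed in.

```python
import re

_LEFTMOST = re.compile(r"\s*(\[[^\]]*\]|[A-Za-z0-9._-]+)")


def canonical(identity: str) -> str:
    # Assumption: an entry containing a dot is an FQDN or address form
    # and is compared in lower case; UUCP-style names are left as-is.
    return identity.lower() if "." in identity else identity


def leftmost_entry(path: str) -> str:
    """Return the first (most recently added) entry of a Path string."""
    match = _LEFTMOST.match(path)
    return match.group(1) if match else ""


def prepend_identity(path: str, own_id: str, verified_source: str) -> str:
    """Add this site's path-identity, plus the verified source if needed."""
    if canonical(leftmost_entry(path)) == canonical(verified_source):
        # Claimed and verified identities agree: comma, then own identity.
        return f"{own_id},{path}"
    # Mismatch: "@", the verified identity, ",", then own identity,
    # each prepended in turn so that own_id ends up leftmost.
    return f"{own_id},{verified_source}@{path}"
```

With a Path of "a.example%x" received from a source verified as "A.Example", the result is "b.example,a.example%x" for a receiving site "b.example"; if the source instead verifies as "10.0.0.1", the result is "b.example,10.0.0.1@a.example%x".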
For historical reasons, the rightmost entry in the Path string generated by most systems is not a site name, but a "user name." However, the Path string is not an E-mail address and MUST NOT be used to contact the user. Injecting agents MAY place any string here that is not a path-identity. If no meaning is anticipated, the string "x" SHOULD be used.
RFC1036 suggested that the last entry could be a site name, requiring software to check it when feeding, but said it also should have a userid for very old systems. As of this specification, systems SHOULD NOT treat the tail entry as a path-identity.
Typically this field will be the only entry on the Path string generated by a poster, or if not generated by the poster, by the injecting agent, which will prepend a "%" and then its own verifiable path-identity. The percent divides the verified part of the Path line from any entries provided prior to injection into the news network. There may be more than one entry after the percent, and all but the last are to be treated as sites.
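As an illustration of the role of the percent sign, the Python sketch below builds the initial Path at injection and later splits a Path into its verified and pre-injection parts. The function names inject and split_at_injection are hypothetical, and the default tail of "x" follows the recommendation for a tail entry with no intended meaning.

```python
from typing import Optional, Tuple


def inject(injector_id: str, tail: str = "x") -> str:
    """Prepend "%" and the injector's path-identity to the poster's entries."""
    return f"{injector_id}%{tail}"


def split_at_injection(path: str) -> Tuple[Optional[str], str]:
    """Split a Path at the first "%".

    Returns (verified_part, pre_injection_part). If there is no "%",
    the verified part is None and the whole Path is returned as the
    second element, since nothing in it was added after a conforming
    injection.
    """
    verified, percent, pre_injection = path.partition("%")
    if not percent:
        return None, path
    return verified, pre_injection
```

For example, split_at_injection("b.example,news.example%host!user") separates "b.example,news.example" from "host!user", where "host" is treated as a site and "user" as the tail entry.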
Injectors SHOULD use the tail entry for local authentication information on the source of an article. For example, if they wish to store an encoding of the IP address of a source machine connecting to do the injection, and/or the UID of an invoking user or any other such information, they may encode it in the tail entry, provided they do so in a manner that will not match any site identifier (e.g., ending with a dot).
Any program which mails to these addresses MUST ensure that no other system will send the same or a highly similar mail message. I.e., all steps MUST be taken to avoid program-detected errors causing mail messages to be sent from multiple sites.
Any mail to these addresses MUST have an "In-Reply-To" MAIL header with the Message-ID of the USENET message which inspired the mail.
See RFC 2142 for other addresses. It defines all these excepting "injector-trouble."
Those who see a problem on the net may take steps to inform those at the source of the problem, but they must think globally, and consider whether many others will be reporting the problem as well. It serves no purpose to inundate personal mailboxes with largely duplicate problem messages. In fact, it's ruder than many of the abuses being complained about.
Don't answer abuse with abuse of the trouble-reporting systems. Abuse of those systems just makes them less useful for dealing with future problems.
Aside from tracing the route articles take in moving over the network, Path is used primarily to allow relaying systems to not send articles to sites known to already have them, in particular the site they came from. This improves the efficiency of links, even ihave/sendme links.
When feeding an article, a relaying agent SHOULD check to see if the path-identity of the recipient site is present in the Path line. If so, it SHOULD NOT feed the article to that site. When testing for a match, case should be ignored on entries in the FQDN form but, for legacy reasons, should be considered relevant on UUCP map entry forms.
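A feed check along these lines could look like the following Python sketch. The name should_feed is hypothetical, and two simplifications are assumptions of the sketch: an identity containing a dot is taken to be in FQDN form, and the tail entry is skipped because it is not to be treated as a path-identity.

```python
import re

_ENTRY = re.compile(r"\[[^\]]*\]|[A-Za-z0-9._-]+")


def should_feed(path: str, recipient_id: str) -> bool:
    """Return False if the recipient already appears in the Path."""
    # Drop the rightmost (tail) entry: it is not a path-identity.
    entries = _ENTRY.findall(path)[:-1]
    if "." in recipient_id:
        # FQDN form: ignore case when comparing.
        wanted = recipient_id.lower()
        return all(entry.lower() != wanted for entry in entries)
    # UUCP map form: case is significant for legacy reasons.
    return recipient_id not in entries
```

So should_feed("B.Example,a.example%x", "b.example") is False, while should_feed("Foo!bar!user", "foo") is True because UUCP names differing only in case are distinct.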
Path is also used for USENET statistics gathering and flow tracking.
Finally the presence of a "%" delimiter in the Path header can be used to identify an article injected in conformance with this standard.
The Path header MUST NOT be truncated.
Whitespace may be present in the Path to make it easier to represent. However, there is no requirement to do so. Whitespace MUST NOT be used as a delimiter without another non-white delimiter also present, although older software may generate it. Any use of a delimiter other than comma (to the left of the percent sign) should be considered an unverified path entry.
The following is a summary of delimiters and the meaning they imply for the name on the right or, in addition, the name to the left.
After the Verified Path deadline date, articles which contain unverified path entries MAY be rejected by other systems, and MAY cause an error message to go back to the poster or the USENET manager at the verified system prior to the unverified entry. (As with all error generating systems, steps MUST be taken to assure only one error message is received by the receiving party per error.)
Old USENET relaying and injecting programs almost all delimit Path: entries with the "!" delimiter, and these entries are not verified. As such, the presence of "%" as a delimiter will indicate the article was injected by software conforming to this standard, and the presence of "!" as a delimiter will indicate the message passed through systems developed prior to this standard. Prior to the Verified-Path-date, messages with mixed sets of delimiters will be common. After that date, all messages should have no "!" delimiters prior to the "%" delimiter.
Sites attempting to verify an incoming entry should take the following approaches for common transports. They are not required, but not following them may lead to wasteful double-entry Path additions.
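One way a site might test whether an article carries only verified entries since injection is sketched below in Python. The name is_fully_verified is hypothetical. The sketch assumes that "verified" means a "%" is present and every delimiter to its left is a comma (optionally surrounded by whitespace); whitespace alone between two entries is rejected.

```python
import re

_ENTRY = re.compile(r"\[[^\]]*\]|[A-Za-z0-9._-]+")


def is_fully_verified(path: str) -> bool:
    """True if the Path has a "%" and only commas to the left of it."""
    verified, percent, _pre_injection = path.partition("%")
    if not percent:
        # No "%": the article was not injected by conforming software.
        return False
    # re.split returns the text around and between the entries:
    # [leading, delimiter, ..., delimiter, trailing].
    parts = _ENTRY.split(verified)
    if len(parts) < 2:
        return False  # no injector entry before the "%"
    if parts[0].strip() or parts[-1].strip():
        return False  # stray punctuation at either end
    return all(delim.strip() == "," for delim in parts[1:-1])
```

A Path such as "c.example, b.example,a.example%x" passes, while "c.example!a.example%x" or "c.example,10.0.0.1@b.example,a.example%x" does not.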
If the incoming article arrives through some protocol local to the site, such as UUCP, that protocol MUST include a means of verifying the article source site, and this should match. In UUCP implementations, commonly each incoming connection has a unique login name and password; that login name could be used to build a suitable verified identifier.
Here is an example of a suitable verification method for an article arriving via a TCP/IP protocol such as NNTP:
There is no firm way to distinguish a path entry generated by new software from one generated by old software that assumes any delimiter is valid. However, use of "!" by old software has become effectively universal.
Sites are not strictly required to use a standard form for their path entry, but if they don't, path lines out of that site get longer due to the adding of the identity. However, groups of associated sites wanting a common identity may decide to use that and let the receiver add the specific site.