hckr.fyi // thoughts

Newsletters are Walling Off Your Freedom

by Michael Szul

I often forget a lot of "tech pundits" come from the finance sector and it's not so much an expertise in technology as it is an expertise in "applied technology"—wherein "applied" relates to capitalism with a potential skew towards financial extraction. This disconnect on my part—having spent over 20 years knee deep in programming—leads to the occasional disagreement with some of these pundits. It happened with Ben Evans over virtual reality. It also happened with Azeem Azhar over Bitcoin.

Recently Azhar complained (in his paid newsletter) about only having Facebook for an ID to log into his Oculus. I brought up how this opens people up to data collection, and he was dismissive of my point—not caring about data collection, but instead upset that he had to use a proprietary ID of a different service to log into the device. He believed that his data didn't matter.

We should not waste too many cycles on this. The real problem is that I can't use the Oculus without having a Facebook ID. And that Facebook has too much market power. My Oculus data is irrelevant, tbh

I don't want to discount Azhar's points on the power of Facebook and the unnecessary usage of a Facebook ID, but his response about his data and Facebook's data collection habits is at odds with reality. My response:

Although I agree with the market power and login/ID statement to believe that any data is irrelevant is naive. The Oculus Quest guardian is literally a map of an area of your home. How many of those data points are being sent back to Facebook--a company well-known for data collection and privacy issues? Pre-COVID people rejected Facebook Portal for this very reason.

My point wasn't to disparage the Oculus, which is a fine device, but the very reason you have to use your Facebook ID is for greater connectivity between Facebook accounts and services for the very purpose of better data sharing and data collection. Your Oculus data is not irrelevant. It's being merged with all the other data points that Facebook is collecting in order to create a profile with which to better sell you advertising (among other reasons).

But a conversation like this is lost in the comments section of a paid newsletter. It's unquotable unless externalized like above. Azhar uses Substack, which is also being used by a plethora of modern journalists, pundits, and hobbyists in order to create a "noise-free" avenue of direct communication. It also helps that it has an easy subscription and payment mechanism.

You may have noticed the increase in email-first communications and paid newsletters over the course of the last few years—especially as more journalists jump ship from traditional news magazines, newspapers, and web sites into a more self-directed writing approach. When working on the Codepunk project with Bill Ahern, we even put out a Bots + Beer newsletter as a secondary source of content for a time (non-paid, of course).

But the novelty wore thin for a few reasons—namely because we were placing efforts on content for a specific audience (newsletter subscribers) that wasn't making it to the hundreds of thousands of people that read Codepunk. Our first transition was to start putting that content online. But with a Bots + Beer web site we were now focusing on two different web sites filled with content instead of one (with the former having markedly more readers). It didn't make sense, so we merged the two, and now the web site hosts the primary content for readers, subscribers, and anyone that searches via a search engine, and the newsletter points people in the right direction in case they missed the content.

That last part should be completely unnecessary given modern technology and standards.

But first, a history lesson.


The Internet in its post-infancy, but pre-commercialization, days was a collection of protocols providing complementary services. Most people today are familiar with HTTP (the protocol of the World Wide Web) and email, but other protocols and services like Finger, Usenet, and Gopher provided other ways to communicate and disseminate information. And of course, if you're really a veteran of the Internet space, you remember bulletin board systems (BBS).

Although the web ultimately won out (with the exception of email, which has become synonymous with daily communication), other protocols still existed on the periphery.

As web sites moved beyond the hundreds and thousands and into the millions, it became harder to disseminate information. This led to Internet directories, LISTSERVs, and eventually search engines, but only the LISTSERVs captured the community feel of the "communication revolution."


LISTSERV is a trademarked name for a specific piece of mailing list software, but it has come to refer (probably unfairly) to mailing list software in general, such as Sympa or GNU Mailman. These automated mailing list applications started to emerge in the mid-1980s as a way to facilitate communication across large disparate groups where many subscribers hadn't met or even communicated outside of cyberspace.

Electronic mailing lists existed before LISTSERV, but those were largely maintained manually and didn't have the automated pass-through of messages, relying instead on full lists of email addresses or on aliases. Most of the early electronic mailing lists were for academic and military purposes, but they were soon adopted by early computer enthusiasts. In the vast wilderness of cyberspace, while web sites were still young, email and mailing lists became the primary tools for delivering information to collaborators.


As web sites began to birth weblogs (blogs), regularly published thoughts and information became more prevalent, including timely news, but how do you get timely information to potential readers? Mailing lists were great, but depending on the communication frequency, the email lists could get overwhelmed with messages, causing you to lose information, replies, etc. But with the emergence of blogging software, two competing standards surfaced to remediate this issue: RSS and later Atom. These were meant to be informative, uniform standards providing a subscription model for structured information (e.g., blogs), and they allowed for the consumption of multiple blogs from a single piece of software.

(Really Simple Syndication) RSS is a way for website authors to publish notifications of new content on their website. This content may include newscasts, blog posts, weather reports, and podcasts. 1
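To make that description concrete, here is a minimal sketch of what a feed reader does with an RSS 2.0 document: parse the XML and pull out each item's title and link. The feed below is hand-written for illustration (the site, titles, and URLs are made up), and the helper name `list_items` is my own. A real reader would also fetch the feed over HTTP and handle malformed XML.

```python
import xml.etree.ElementTree as ET

# A hypothetical, hand-written RSS 2.0 feed (not taken from any real site).
RSS_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://example.com/</link>
    <description>Posts from an example blog</description>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
      <pubDate>Mon, 01 Feb 2021 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
      <pubDate>Tue, 02 Feb 2021 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


for title, link in list_items(RSS_FEED):
    print(f"{title}: {link}")
```

Subscribing to many blogs from one application amounts to running this same loop over a list of feed URLs on a schedule and remembering which items have already been seen.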

Structured information is not new on the web. The idea was circulating as far back as 1995 at Apple Computer. The first version of RSS (known as RDF Site Summary) appeared in the late 1990s, primarily driven by Netscape. When RSS 0.9 transformed into RSS 0.91, the format was simplified and diverged from a traditional RDF structure. RSS then stood for Rich Site Summary.

RDF (or Resource Description Framework) is a framework for describing structured data as subject–predicate–object statements known as triples, and it is commonly serialized as XML. Today, you see this most often in metadata parsing applications that require querying technologies like SPARQL.
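If triples sound abstract, a toy sketch may help. The snippet below stores a handful of made-up statements as plain Python tuples and answers pattern queries with wildcards, which is roughly the idea behind a query language for triples. It is an illustration only: it does not use a real RDF library, the `dc:title` and `dc:creator` predicate strings are shorthand labels rather than resolved URIs, and the `match` helper is hypothetical.

```python
# Each statement is a (subject, predicate, object) triple.
# The URLs, names, and predicate labels are invented for illustration.
TRIPLES = [
    ("https://example.com/first", "dc:title", "First post"),
    ("https://example.com/first", "dc:creator", "Alice"),
    ("https://example.com/second", "dc:title", "Second post"),
    ("https://example.com/second", "dc:creator", "Bob"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# "Which resources were created by Alice?"
for s, _, _ in match(TRIPLES, predicate="dc:creator", obj="Alice"):
    print(s)
```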

When Netscape was purchased by AOL, AOL abandoned RSS support and the community split into two camps: UserLand (company of blogging pioneer Dave Winer) and the RSS-DEV working group. O'Reilly Media representatives worked together with several other organizations and individuals in this RSS-DEV working group to further the specification, while UserLand attempted to copyright the format and trademark the RSS name (which was rejected by the US Patent and Trademark Office).

Dave Winer's company was one of the first to concentrate on content management and blogging, and he had originally developed his own syndication format—ScriptingNews, named after his blog. When RSS 0.91 was released, he replaced his syndication specification with the new RSS version.2

The RSS-DEV working group re-injected RDF into the specification with version 1.0, which also added support for XML namespaces.

Winer, meanwhile, added enclosures to his competing specification, introducing RSS 0.92. These enclosure elements allowed for RSS to ultimately be used for podcast feeds.
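As a rough sketch of why enclosures matter for podcasts: an `enclosure` element attaches a media file to an item through its `url`, `length`, and `type` attributes, and a podcast application simply collects those. The feed below is hand-written with made-up titles and URLs, and the `enclosures` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# A hypothetical podcast feed; only the first item carries an audio file.
PODCAST_FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/audio/ep1.mp3"
                 length="12345678" type="audio/mpeg"/>
    </item>
    <item>
      <title>Show notes only</title>
    </item>
  </channel>
</rss>"""


def enclosures(feed_xml):
    """Return (item title, media URL, MIME type) for items with an enclosure."""
    root = ET.fromstring(feed_xml)
    found = []
    for item in root.iter("item"):
        enc = item.find("enclosure")
        if enc is not None:
            found.append((item.findtext("title"), enc.get("url"), enc.get("type")))
    return found


print(enclosures(PODCAST_FEED))
```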

With RSS 2.0, Winer re-dubbed the specification Really Simple Syndication, while now also introducing XML namespaces to the specification.

Of course, despite the advancements, neither Winer nor the RSS-DEV working group could lay claim to being the official publishers of RSS. This left the world with two competing specifications with the same name… and no agreement on which was the one to follow.

Mostly, this was a disagreement on what RSS was actually for:

[There] was eventually contention between a “Let’s Build the Semantic Web” group and “Let’s Make This Simple for People to Author” group […] 2

This confusion and the convoluted history of RSS led to the creation of the Atom specification, meant to be a clean slate for syndication, and for a time Atom was a relatively popular alternative.

Winer eventually assigned his "copyrights" to Harvard's Berkman Center for Internet & Society (now the Berkman Klein Center) in 2003 and launched an RSS Advisory Board to clean up ambiguities in the specification.

Unfortunately, multiple versions of the specification still exist today, but largely, RSS 2.0 has been the most widespread version of the specification used (including over Atom), and represents more than 50% of the current versions published on the Internet.


The fight for RSS was not a fight for capitalist control—Netscape had already abandoned the format when AOL took over, and over the course of the last few decades, most browsers, publishers, and social media companies have abandoned the standardized format in order to lock users into their own proprietary feed offerings.

Instead, the RSS fight was a philosophical one over how best to represent structured web information. On the frontline of this fight was a 14-year-old boy named Aaron Swartz. Swartz participated in the early RSS-DEV working group before anyone realized how young he was. Although his contributions were to the RSS 1.0 specification that ultimately didn't win out, his main push was for the inclusion of namespaces in the XML format, so that components outside of the original specification could be added. This inclusion of namespaces was eventually adopted in the Winer version of RSS (and is included today), and is the primary reason why iTunes has its own namespace elements required for inclusion through Apple's Podcasts Connect.
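Here is a small sketch of what namespaces buy you in practice. The feed declares the iTunes podcast namespace and adds `itunes:author` and `itunes:duration` alongside the core RSS elements; a reader that knows the namespace can read them, and one that doesn't can ignore them. The feed content is made up, and the `itunes_fields` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI mapping for the iTunes podcast extension.
NS = {"itunes": "http://www.itunes.com/dtds/podcast-1.0.dtd"}

# A hypothetical feed mixing core RSS elements with namespaced ones.
NAMESPACED_FEED = """<rss version="2.0"
     xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel>
    <title>Example Podcast</title>
    <itunes:author>Example Host</itunes:author>
    <item>
      <title>Episode 1</title>
      <itunes:duration>1800</itunes:duration>
    </item>
  </channel>
</rss>"""


def itunes_fields(feed_xml):
    """Return the channel's itunes:author and each item's itunes:duration."""
    channel = ET.fromstring(feed_xml).find("channel")
    author = channel.findtext("itunes:author", namespaces=NS)
    durations = [
        item.findtext("itunes:duration", namespaces=NS)
        for item in channel.iter("item")
    ]
    return author, durations


print(itunes_fields(NAMESPACED_FEED))
```

Looking up plain `author` without the namespace finds nothing, which is the point: extension elements live in their own space and never collide with the core specification.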

From the software perspective, Swartz is best known for his work as a founder of Reddit (yes, a founder, no matter what Steve Huffman says); however, even as early as 14, he was concerned with the culture of the Internet, freedom of information, and social justice. Despite his contributions to the history of the Internet, Swartz eventually abandoned Reddit, not appreciating the direction Condé Nast was taking the web site and coming to several disagreements with fellow Reddit founders. This driving desire towards freedom of information led him to participate in the development of several tools (e.g., SecureDrop) to help with press freedoms, and his social justice attitude prompted him to found Demand Progress.

Aaron Swartz from https://commons.wikimedia.org/wiki/File:Aaron_Swartz_profile.jpg

A child of the free Internet, Swartz began to have issues with the way scientific journals were controlling access to what was mostly publicly funded research.

He exclaimed in his Guerrilla Open Access Manifesto that:

Information is power. But like all power, there are those who want to keep it for themselves. The world's entire scientific and cultural heritage, published over centuries in books and journals, is increasingly being digitized and locked up by a handful of private corporations. Want to read the papers featuring the most famous results of the sciences? You'll need to send enormous amounts to publishers like Reed Elsevier.

Adhering to this guerrilla principle, while attending Harvard on a fellowship and visiting MIT, Swartz installed a laptop in a networking closet and began downloading research papers via the JSTOR digital repository—a repository he had access to through his fellowship.3 In the on-going pursuit of criminal charges, many questioned the zeal with which the federal government pursued Swartz:

On July 14, 2011, Swartz was first charged based on a four count indictment, which alleged that he had improperly downloaded some four million academic journal articles from JSTOR, a data-based website that is available by subscription only. (Subscriptions are expensive and are typically purchased by academic institutions, rather than individuals.) On September 12, 2012, the prosecutor upped the stakes further, with the grand jury handing down a new and replacement thirteen-count indictment based on the same basic facts.

[…]

From the initial statements of Boston U.S. Attorney Carmen Ortiz, who made the highly debatable statement that “Stealing is stealing, whether you use a computer command or a crowbar,” to the upping of the charges from four counts in 2011 to thirteen counts in 2012, this was heavy-handed treatment for a 25-year-old information activist.

In fact, the backtracking defense mounted by the Boston U.S. Attorney’s Office of its own conduct in the Swartz case reveals that U.S. Attorney Carmen Ortiz now understands that they were using a sledgehammer for something that was merely worthy of a slap on the wrist, apparently along with her assistant, career federal prosecutor Stephen Heymann. For example, CNN reports that Ortiz’s office now says the prosecutors had no evidence that Swartz had acted for personal gain, and they apparently concede that calling for the harshest penalties available under the law was not appropriate, given his alleged offenses.4

Instead of a tempered pursuit, prosecutorial overcharging led to the possibility of 35 years in prison and $1 million in fines. On January 11, 2013, Aaron Swartz was found hanged in his apartment… Robert Swartz, Aaron's father, exclaimed:

Aaron was killed by the government […]5

The death of this freedom of information advocate reverberates even today, as the fallout from the handling of his prosecution has enveloped the political game and helped change the fate of Carmen Ortiz. Consider this recent piece from the Intercept:

The last time Joe Biden was in the White House, Boston Mayor Marty Walsh seemed an unlikely nominee for a future labor secretary. Carmen Ortiz, President Barack Obama’s U.S. attorney for the District of Massachusetts, had Walsh in her crosshairs. One summer dawn in 2016 she sent FBI agents to arrest two of his staff under a federal racketeering indictment.

[…]

Fortunately for Walsh, by then Ortiz had lost strong allies and upset powerful enemies. U.S. Sen. Ted Kennedy, D-Mass., passed away after championing Ortiz for her U.S. attorney role. U.S. Attorney General Eric Holder, who toiled beside Ortiz in their early days at the Justice Department, had returned to the private sector, dogged by a high-profile Ortiz prosecution gone wrong.

[…]

Between Kennedy’s funeral and Holder’s exile, Ortiz and her then-chief of cybercrime, Stephen Heymann, indicted internet freedom activist Aaron Swartz on 14 felony counts for allegedly downloading too many academic journal articles. Swartz had used a simple script to download academic journal articles from the platform JSTOR, which provided its articles free to anyone on the MIT network. It’s not clear Swartz even violated the company’s terms of service; finding a crime anywhere in what he did took an awfully creative prosecutor.

[…]

An expert defense witness who never got to testify about MIT’s site license for JSTOR for “unlimited” use of the JSTOR library containing the articles would have made the conviction that much more difficult. The articles themselves were produced at taxpayer expense but then paywalled to limit taxpayer access. Neither MIT, where Swartz registered his laptop for the downloads, nor JSTOR wanted to press charges. And Swartz was well known for his work on the RSS standard, helping seed RECAP (now CourtListener), as well as for the credits he’d earned as a co-founder of DemandProgress, Creative Commons, and Reddit.

Those facts seemed immaterial to Ortiz, who told the press after the July 2011 indictment: “Stealing is stealing,” she said, “whether you use a computer command or a crowbar, and whether you take documents, data or dollars.”

Swartz likely intended to republish the articles for free, and Ortiz made sure that would never happen. She and Heymann pushed for a maximum federal prison sentence of 35 years.

Looking to avoid a trial, Heymann compared Swartz to a rapist. By refusing to plead guilty, the line went, Swartz had “revictimized” MIT. Swartz fervently resisted, but Ortiz and Heymann had a trump card.

[…]

Within days of Swartz’s death, over 61,000 people digitally signed a White House petition to fire Ortiz — a singular distinction for a U.S. attorney. The Senate and House judiciary committees pilloried her.

Swartz was a pioneer of Internet freedom taken down by both capitalist and governmental overreach… and all he wanted to do was make information free.


This was a long and twisting road to get back to the concept of paid newsletters, but the driving factor is the commercialization of communication and information. Douglas Rushkoff believed that the communication revolution was re-christened the "information revolution" because authentic communication between individuals was free, but information could be commoditized and sold. The Internet started as a new frontier of mutualism and open protocols, and it seems that at each crossroads, somebody is trying to stamp out or control the protocols and processes that enrich people through the freedoms of the Internet. Swartz committed suicide after having the book thrown at him for audaciously believing that academic research that furthers humankind ought not to be locked away for only those with enough money to pay for it. This came after he had participated in a multitude of Internet ventures pursuing open communications and free information. In working on RSS, Swartz was collaborating with Internet pioneers trying to make structured data easily accessible and shareable regardless of platform. Unfortunately, RSS hit roadblocks (some self-imposed) over the lack of corporate adoption, and even the rescinding of corporate support, mostly so those same corporations could push their own locked-in protocols, all to drive growth.

RSS peaked in popularity with Google Reader, but when Google shut down its Reader software, it removed the largest aggregator from the market, forcing people to look elsewhere. Elsewhere led to less complete software, additional logins, and friction between everyday users and their software. This loss of ease of use limited (and even reduced) the appeal of RSS. Many web sites that are RSS capable today don't even publicize their RSS feed. You have to go hunting in the code.

Without that ease of use, everyday individuals retreated from RSS, and it is no coincidence that few noticed the decline, mostly thanks to the rise of social media, which replaced personally curated RSS feeds with advertising-enhanced algorithms… and all the vitriol that comes with social media.

Drew Austin once wrote about email being a way to retreat from the algorithm and many people found comfort in a self-publishing model to deliver information directly to readers again—only instead of interactive LISTSERVs, we're left with single-direction newsletters.

Single-direction was always there, but it is now the preferred newsletter distribution model.

Substack didn't invent the newsletter, but they did a good job of partnering with major names in the beginning to push their platform into the forefront. If you look at Substack from the web application side, it's a blog with comments. The difference is that it allows for subscriptions and each post can be sent out in an email. We can call it a newsletter, but really it's a combination of a blog with email capabilities.

Substack isn't the only platform, but between Substack, Revue (Twitter's recent purchase), and other platforms, these email subscriptions have taken off, with names like Azhar, Glenn Greenwald, and Matt Taibbi jumping on board. For some, like Greenwald, it presents a platform for communication devoid of the censorship he experienced at news organizations (including his own Intercept). For others (e.g., Azeem Azhar, Ben Evans, Ben Thompson) it's a platform to generate revenue for their business analysis.

The problem with these newsletter platforms is that they are essentially walling off content behind paywalls, subscription logins, and mass email providers. Some of these platforms give you a window into the content… before shutting that window in your face. People retreated to email because we failed to give them a viable alternative.

This isn't a rant against paid newsletters. I personally pay for several newsletters, paid web sites, Patreon accounts, etc. It's a rant against the format. Keep paying writers and artists for content. They deserve it.

On top of the problem of a walled Internet, I'm left with an inbox filled with newsletters that I now need to organize in subfolders to represent topics, priorities, etc. Although email management has gotten better, only the protocols associated with email are standardized, not inbox design, labeling, tags, or other such features. Most email clients compete with each other on proprietary features, not the best implementation of the standard. Furthermore, those features can differ between email clients and devices.

RSS readers, on the other hand, implement a standard, and although readers can certainly implement their own proprietary functionality, the extensible nature of XML and XML namespaces allows these features to be documented within the RSS feed so that other readers can implement parity, if appropriate.

But namespaces… what are those? Most people don't realize that RSS is a subscription model or that podcasts are actually served via RSS. I'm pained when I see web sites list their Spotify and iTunes links for their podcast, but don't actually list the RSS feed for people who want to listen to the podcast in their own podcast application. Most other applications have to rely on auto-discovery via a link tag in the web site's markup… but sometimes these tags are wrong.
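For the curious, auto-discovery conventionally works through a `link` element with `rel="alternate"` and a feed MIME type in the page's head. The sketch below scans a page for those hints using only the standard library. The page markup and URLs are invented, the `FeedLinkFinder` class and `discover_feeds` function are hypothetical names, and a real application would also need to resolve relative URLs and cope with tags that point at the wrong place.

```python
from html.parser import HTMLParser

# MIME types that conventionally mark a link as a syndication feed.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect href values from <link rel="alternate"> feed hints."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        if "alternate" in rel and mime in FEED_TYPES and attr.get("href"):
            self.feeds.append(attr["href"])


def discover_feeds(html):
    """Return every advertised feed URL found in an HTML document."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# A hypothetical page that advertises one feed next to a stylesheet.
PAGE = (
    '<html><head><link rel="stylesheet" href="/site.css">'
    '<link rel="alternate" type="application/rss+xml" '
    'href="https://example.com/feed.xml"></head><body></body></html>'
)
print(discover_feeds(PAGE))
```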

In fact, just have a glance at the explanation of RSS in the Lifewire article we quoted earlier. It slowly devolves into a collection of gibberish (for the everyday person).

The biggest issue with RSS is that the only way people have found to monetize it is to truncate content and place the rest behind a web site paywall. Others have created RSS feeds that provide a token to subscribers (some podcasts do this with private feeds). A newsletter is always in your inbox, which has become a fixture of daily life, and that inbox is, in theory, private, unless you forward the email. RSS feeds require an application to honor authentication, or blogs need to provide specific (and private) RSS feeds. I was never a fan of truncating content.
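One way a tokenized private feed could work, sketched under heavy assumptions: the publisher derives a per-subscriber token from a server-side secret, embeds it in the feed URL, and checks it on every request. Everything here is hypothetical (the secret, the URL layout, the function names), and this is not how any particular podcast host or newsletter platform does it. A real system would also need token revocation and HTTPS.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice this would never live in code.
SECRET_KEY = b"replace-with-a-real-secret"


def feed_token(subscriber_id):
    """Derive a stable, unguessable token for one subscriber."""
    return hmac.new(
        SECRET_KEY, subscriber_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def private_feed_url(subscriber_id, base="https://example.com/feed"):
    """Build the feed URL a paying subscriber pastes into their reader."""
    return f"{base}/{subscriber_id}/{feed_token(subscriber_id)}.xml"


def is_valid(subscriber_id, token):
    """Check a presented token using a constant-time comparison."""
    expected = feed_token(subscriber_id).encode("utf-8")
    return hmac.compare_digest(expected, token.encode("utf-8"))


print(private_feed_url("reader-42"))
```

The appeal of this approach is that any ordinary RSS reader can subscribe without supporting logins. The weakness is the same as forwarding an email: anyone who gets the URL gets the content.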

The other problem RSS has is the aforementioned ubiquity. Even your grandmother has an email address. It's one of the oldest protocols and it's ingrained in both business and personal interactions. It's more common than having a social media account. But for most people, RSS is still a mystery shrouded in techno-babble. Until we make RSS subscriptions as easy as emails, we'll continue to wall off the Internet behind social media algorithms and newsletter subscriptions. Until we make it easy, we'll continue to wall off the Internet behind closed windows, proprietary corporate protocols, and decisions that only benefit financial extraction.