Archive: Wikipedia

I was pretty excited to learn this week about Domesday Reloaded. The Domesday project aimed to take a snapshot of British life in 1986. 25 years on, the BBC are looking to update it to document the changes that have taken place since then.

I have been interested in the Domesday project for a while. What fascinates me is that a snapshot of Britain was taken, in the form of maps, photographs and text, yet the data was unavailable to most people.

The Domesday project was as much an ambitious experiment with technology as anything else. The technology was just about available, but a lot of pioneering work had to be done, and the hardware required for it was prohibitively expensive, leaving many of the contributors somewhat miffed.

Since then, it has become one of the most famous examples of digital obsolescence. This was due to a combination of idiosyncratic code and the increasing rarity of the technology required to read the discs.

The Domesday project came at a time when the technology was available, but the standards were not yet there to make it stable enough for long-term preservation, or even easy access in the short term. It’s a reminder that digital technologies are hugely enabling, yet frighteningly fragile.

Then there are the copyright issues surrounding both the content and the technology.

Joys of browsing Domesday Reloaded

The BBC should be applauded for finally managing to open up some of the data to the public on the web. The Domesday project was created before the web was invented. This isn’t how the content was designed to be viewed, so navigation is a bit cumbersome.

But aside from this gripe, the Domesday Reloaded website is turning out to be a fascinating resource.

I was born in 1986, the same year in which the Domesday project disc was published. So the Britain described here is a place that I don’t remember. But enough of it is familiar for it to feel incredibly relevant to me. It’s almost like being given a little upgrade to my memory, so that I can have snippets of knowledge from just before I was born.

Take the photographs for D-block GB-328000-690000 — the centre of Kirkcaldy, my hometown (D-block being one of the 4km by 3km areas the UK was divided into). It took me a little while to recognise “Kirkcaldy’s busy High Street”. But once I spotted British Home Stores, I was right there.
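
For the curious, here is a minimal sketch of how a D-block identifier like the one above might be unpacked. It rests on an assumption of mine, not on anything Domesday Reloaded documents: that the two numbers are the easting and northing, in metres, of the block's south-west corner, and that the 4km side runs east-west. Both identifiers mentioned in this post divide evenly that way, which is consistent with the guess but does not prove it. The DBlock class and its methods are hypothetical names for illustration only.

```python
from dataclasses import dataclass

# Assumed D-block size in metres: 4 km east-west by 3 km north-south.
DBLOCK_WIDTH_M = 4000
DBLOCK_HEIGHT_M = 3000


@dataclass(frozen=True)
class DBlock:
    """Hypothetical model of a D-block, assuming the identifier holds
    the south-west corner as easting and northing in metres."""
    easting: int
    northing: int

    @classmethod
    def parse(cls, identifier: str) -> "DBlock":
        # Expected shape (an assumption): "GB-<easting>-<northing>"
        prefix, easting, northing = identifier.split("-")
        if prefix != "GB":
            raise ValueError(f"unexpected prefix: {prefix!r}")
        return cls(int(easting), int(northing))

    def bounds(self):
        """Return (west, south, east, north) in metres."""
        return (
            self.easting,
            self.northing,
            self.easting + DBLOCK_WIDTH_M,
            self.northing + DBLOCK_HEIGHT_M,
        )

    def contains(self, easting: int, northing: int) -> bool:
        """True if the point falls inside this block (east and north edges excluded)."""
        west, south, east, north = self.bounds()
        return west <= easting < east and south <= northing < north


kirkcaldy = DBlock.parse("GB-328000-690000")
print(kirkcaldy.bounds())
```

If the assumption holds, any grid reference inside that rectangle would belong to the Kirkcaldy block, and the neighbouring blocks would be found by stepping 4,000 metres east or 3,000 metres north.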

Yet, despite the familiarity, it is almost a completely different world. My memory of the High Street before it was pedestrianised is very limited. But it is just within touching distance of my memory for me to feel a strong connection with it.

The text entries are also fascinating. Most of the contributions were provided by primary schools. A decision was taken by the Domesday project not to edit the contributions, so the quality and style of writing varies from area to area.

As such, what strikes me the most is that it informs you as much about the prejudices of the school pupils and their teachers as it does about the area. It also retains their poor spelling and strange grammar.

For instance, an entry from Dundee (D-block GB-336000-732000) called ‘Traffic in and out’ is a basic survey of vehicles travelling on a road, with guesses as to where the vehicles are going and why. It lacks the academic rigour you would ideally want from a historical document.

But while some of the entries may seem banal, it was designed to be this way. The aim was to genuinely document society by capturing children’s curiosity with everything. This way it wouldn’t leave out what adults perceive as being obvious, when it wouldn’t necessarily be so obvious to someone in 1,000 years.

Missing D-blocks in Dundee on Domesday Reloaded

The really big shame is that not every part of Britain was documented. I could understand remote rural areas not being included. But sadly some highly populated areas have also been missed out. For instance, two D-blocks that cover the centre and east of Dundee lie blank, as does much of London.

But what exists is a joy. Even in the little amount of scanning I have done, I have already learned new information about the area I live in, which has set my mind racing and inspired me to investigate further.

Challenges for the modern day equivalents

What also struck me is how we actually already have readily-accessible modern-day equivalents of the Domesday project, almost by accident. The BBC is asking for users to update the content for D-blocks that were documented in 1986, to take an equivalent snapshot of 2011. I may go out and take some photographs for that.

But this sort of local information is staggeringly well documented already. We have Wikipedia, which can be edited by anyone but retains an academic approach that the Domesday project lacked. As such, it is a treasure trove of local information that can probably be relied on more.

Meanwhile, Google Earth and Google Maps provide masses of images of all corners of the country. It absolutely dwarfs what’s on Domesday Reloaded.

But the big question, which can’t be answered at the moment, is whether the wealth of information available on the web can be packaged up into a Domesday-style snapshot and preserved forever. The challenges of web preservation are massive.

Like the Domesday project, we could find the digital information almost slipping through our hands. The BBC know that themselves. With a stroke of a pen, it was decided that a significant chunk of British web heritage will be removed when the BBC removes some of its archived pages from the web.

Today’s XKCD led me to look at the Wikipedia article ‘List of numbers’ out of curiosity.

I was surprised to see listed among the ‘notable integers’ were a few telephone numbers, such as 999 and 911. I guess these were included on the basis of their “cultural meanings”, although they are not integers. (They have since been removed from the article.)

But I was surprised that the most culturally significant telephone number of the past decade — that’s right, 118 — was left out. So I decided to fix that.

Notable integers, including 118

(Don’t worry, I only did this in Firebug, not on Wikipedia itself.)

I saw this story on Scotsman.com today about the Scottish Parliament’s Public Petitions Committee attempting to reach out by using social media. Of course, I am all for the correct use of social media as a sensible and low-cost way for any organisation to communicate with the public and to allow people to get in contact. But there was something about this story that just seemed odd.

HOLYROOD chiefs are to use blogs, Wikipedia and YouTube to make Parliament more accessible to the public, they said today.

People petitioning Parliament will be able to provide videos and photographs.

And Holyrood’s Public Petitions Committee is to have its own blog and Wikipedia page.

It’s the mention of Wikipedia — twice — that tweaked my antenna. How exactly does Parliament intend to “use Wikipedia” to become more accessible to the public? Perhaps they meant using wikis, and got that confused with Wikipedia.

I decided to delve a bit further in case The Scotsman got the wrong end of the stick (which, let us face it, is fairly likely). But the Scottish Parliament’s press release seemed even odder.

As from today blogging, Wikipedia and YouTube will be some of the new social media tools introduced by the Public Petitions Committee as part of its report publication. The report is the result of a year-long inquiry into improving awareness and participation in the public petitions process.

Petitioners will be able to provide videos and photos about their petitions as part of the committee’s new blog page. A podcast, Wikipedia page and dvd about the Parliament’s public petitions system all signal the committee’s commitment in encouraging access to and awareness of the petitions process. The committee also supports the creation of local petitioning systems with local authorities.

I was still confused, so I took a look at the Public Petitions Committee’s report to see what the plans actually were. You can read the details of its plans to use social media under the heading “E-Based” (paragraph 84 onwards).

In paragraph 119 the Public Petitions Committee says: “We are launching, alongside this report, a dedicated Public Petitions Committee Wiki page.” The footnote takes you to this Wikipedia article. This is an article which was already deleted when I checked it early this afternoon, and remains deleted as I write this article.

The Public Petitions Committee’s attempt to use Wikipedia like this completely misunderstands what Wikipedia is for. A page such as the one the Public Petitions Committee tried to create is completely against Wikipedia guidelines. Wikipedia is an encyclopedia, not some kind of worthy version of Craigslist. They could try reading about What Wikipedia is not, notably that Wikipedia is not a soapbox:

Wikipedia is not a soapbox, a battleground, or a vehicle for propaganda and advertising… Therefore, content hosted in Wikipedia is not… [p]ropaganda, advocacy, or recruitment of any kind, commercial, political, religious, or otherwise…

[Content hosted in Wikipedia is not] Self-promotion. It can be tempting to write about yourself or projects in which you have a strong personal involvement. However, do remember that the standards for encyclopedic articles apply to such pages just like any other, including the requirement to maintain a neutral point of view, which is difficult when writing about yourself or about projects close to you.

A subject is considered worthy of an article on Wikipedia by the bottom-up processes upon which Wikipedia is based. It is not for the Public Petitions Committee to swan in and create a page for itself. Nor can it be the final arbiter on what that article contains. The report states in somewhat Orwellian fashion:

We are of course mindful of the ability to amend text given the ‘ongoing principle’ under which Wiki pages are created. Our clerks will monitor the page carefully to ensure it remains a factual and authoritative source of information about our public petitions process.

Moreover, Wikipedia is not a manual, guidebook, textbook or scientific journal:

Wikipedia is an encyclopedic reference, not an instruction manual, guidebook, or textbook. Wikipedia articles should not read like… Internet guides. Wikipedia articles should not exist only to describe the nature, appearance or services a website offers, but should describe the site in an encyclopedic manner, offering detail on a website’s achievements, impact or historical significance…

In paragraph 109, the Public Petitions Committee itself says of its attempts to use social media that it is “not seen as ticking a box which says ‘look, we are doing this because everyone else is!’”. But this Wikipedia stunt has box-ticking written all over it. It has Dad-dancing written all over it.

I’m sure using Wikipedia to publicise the Scottish Parliament’s petitions process seemed like a good suggestion in a meeting room somewhere. But they could have done with a better understanding of what Wikipedia actually is before proceeding with the idea.

Luckily, the Public Petitions Committee didn’t put all of its eggs in one basket. There will also be a “pod cast”, which currently seems to be a solitary MP3, tucked away at the bottom of the press release. Other than that, there is a promise to link to the Scottish Parliament’s own podcasts. There is no RSS feed and no option to subscribe.

Let’s look it up on the Public Petitions Committee’s new best friend Wikipedia. The article for Podcast is currently illustrated with a massive RSS icon. It says:

A podcast is a series of digital media files, usually either digital audio or video, that is made available for download via web syndication. The syndication aspect of the delivery is what differentiates podcasts from other ways of accessing files, such as simple download or streaming: it means that special client software applications known as podcatchers (such as Apple Inc.’s iTunes or Nullsoft’s Winamp) can automatically identify and retrieve new files in a series when they are made available, by accessing a centrally-maintained web feed that lists all files currently associated with that particular podcast. The files thus automatically downloaded are then stored locally on the user’s computer or other device, for offline use.

I therefore await the launch of some actual podcasts, not just MP3s branded as “pod casts”.
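
To make the distinction concrete, here is a rough sketch of what a podcatcher does with a feed, which is exactly what a lone MP3 cannot offer. It assumes an RSS 2.0 feed in which each item carries an enclosure element pointing at the audio file. The sample feed, its example.org URLs and the function names are all made up for illustration; this is not the Scottish Parliament's feed, because there isn't one.

```python
import xml.etree.ElementTree as ET

# A hypothetical RSS 2.0 feed. The titles and URLs are invented for this sketch.
SAMPLE_FEED = """
<rss version="2.0">
  <channel>
    <title>Hypothetical petitions podcast</title>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://example.org/episode2.mp3" length="0" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.org/episode1.mp3" length="0" type="audio/mpeg"/>
    </item>
  </channel>
</rss>
"""


def enclosure_urls(feed_xml):
    """List the media file URLs named in the feed, in feed order."""
    root = ET.fromstring(feed_xml)
    return [enc.get("url") for enc in root.iterfind("./channel/item/enclosure")]


def new_episodes(feed_xml, already_downloaded):
    """Return the URLs in the feed that have not been fetched yet."""
    return [url for url in enclosure_urls(feed_xml) if url not in already_downloaded]


# A subscriber who already has episode 1 is only offered episode 2.
print(new_episodes(SAMPLE_FEED, {"http://example.org/episode1.mp3"}))
```

The point is the second function: because the feed lists every file in the series, software can check it on a schedule and fetch whatever is new. Without a feed, the listener has to keep coming back to the press release.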

The Public Petitions Committee will also have a “blog page”. That can be found here and, in fairness, it doesn’t look all that bad. It looks like a good way to highlight the work of the Public Petitions Committee.

I think organisations like the Scottish Parliament should be using social media and web technologies more. So the Public Petitions Committee’s steps in this direction are welcome. The blog looks particularly promising.

But engaging with the public is about so much more than tossing around buzzwords like ‘Wikipedia’, ‘YouTube’ and ‘podcasts’. A proper understanding of social media would provide a better service to the public and waste fewer resources.

Last week I was in the pub with a friend and we were talking about blogging. This person doesn’t know much about it, but he knows that I’m heavily interested in it. (NB. This person is a Labour Party supporter, which explains a lot.)

He asked me a really strange question. “So, who is it that’s in charge of blogging then?”

“What do you mean, ‘in charge’?”

“Well, there must be someone who’s behind it all.”

“What do you mean? No! It’s something that you do yourself! Anyone can set up a blog.”

I actually had to explain to him that there is no overlord that looks after the blogosphere. There is no official process. You don’t have to ask anyone’s permission to set up a blog.

And that’s the way it should be, right? Blogging — and, indeed, the internet as a whole — is fundamentally a medium of freedom. Blogging is about many of the things we value the most about freedom — of speech, protest, association. And for many oppressed people in this world who would otherwise not be allowed to express themselves, blogging offers the chance to speak out to a wide audience.

The day you have to ask permission to blog is the day you have to ask permission to express an opinion. (Of course, thanks to our friends in the Labour Government, you already do have to ask permission to express your opinion in this country, but that is a whole new blog post.) What amazes me is not just that some people think that’s the way it should be. It is that they think it’s the way it already is and are so unconcerned about it.

Still, at least we know it’s not going to happen, right? Right?

Actually, no. Some poisonous person called Marianne Mikko wants to put a stop to all of that “expressing your opinion” nonsense. Marianne Mikko is an Estonian centre-left MEP. It would be someone on the left, wouldn’t it? If anyone asks me why I don’t see myself as being on the left, it is because the left contains people like this.

Here is what she has to say: “the blogosphere has so far been a haven of good intentions and relatively honest dealing. However, with blogs becoming commonplace, less principled people will want to use them”.

Clairwil’s sarcastic response is the only sensible one: “Oh God! I hate ‘less principled’ bloggers!”

And the solution for stopping less principled people from having a blog? Why, red tape of course!

I think the public is still very trusting towards blogs, it is still seen as sincere. And it should remain sincere. For that we need a quality mark, a disclosure of who is really writing and why.

It’s interesting that Ms Mikko thinks that the public trusts blogs, because it doesn’t seem that way to me. Take the aversion that many people have to Wikipedia. “You can’t trust that, you know — anyone can edit it,” they say. That is despite the fact that it contains few more errors than Encyclopædia Britannica does. You hear much the same things about bloggers. They’re not to be trusted. (Of course, the mainstream media is responsible and measured in all of its output!)

That’s just the beginning though. Here is what German ‘Liberal’ Jorgo Chatzimarkakis — a member of Germany’s “Free Democratic Party” — has to say:

bloggers cannot automatically be considered a threat, but imagine pressure groups, professional interests or any other groups using blogs to pass on their message.

Just imagine it! Imagine all those pressure groups. Imagine any other groups! All using tools to communicate with people! Isn’t it just shocking?

Mr Chatzimarkakis continues that blogs “can be seen as a threat”. A threat to what? His job? Then good! Honestly. If this is the sort of thing that comes out of Germany’s “Free Democratic” Party, I dread to think of the illiberal nonsense the other parties come out with.

The thing about it is that you are perfectly welcome to choose which blogs you trust and which you don’t. For me, there are of course some blogs that I trust more than others. I am happy with the decisions I make in this regard. And if it turns out I was wrong about a blog then I just change my mind. Easy.

So what on earth is this ‘quality mark’ nonsense all about? Do these people really think that we are unable to decide for ourselves what we can read on the internet? If these people get their way, soon enough the government will be telling us what to read. If the government tells me to read something though, that is a sure fire sign that I ought to steer clear of it.

Quality mark? Sounds more like skid mark to me.

This might be laughed off by some. But the fact that there are politicians even talking about this is enough to make my blood boil. How can these people have such scant regard for a fundamental right such as freedom of speech?

And, via the comments at The Devil’s Kitchen, it appears as though in Italy they are at an advanced stage of legislation requiring people to register their blogs. Not only that, they would have to pay a tax as well!

The Levi-Prodi law lays out that anyone with a blog or a website has to register it with the ROC, a register of the Communications Authority, produce certificates, pay a tax, even if they provide information without any intention to make money… the Levi-Prodi law obliges anyone who has a website or a blog to get a publishing company and to have a journalist who is on the register of professionals as the responsible director. 99% would close down.

Jesus Shite! Are we really headed down this road?

Some bloggers are in a flap at the moment because Google has seemingly manually downgraded the PageRank of some websites. The reason appears to be that the websites in question sell paid links.

Some of the websites in question are pretty big. Washington Post, Engadget, Weblog Tools Collection, Joystiq, Problogger.

This blog also sells text links, although I don’t think I’ve been hit by Google’s bitch-slapper. My PageRank at the moment is 5, which I think is what it was before. I don’t actually know, because I don’t really care about my PageRank as much as, say, my Technorati authority or the number of visitors.

Anyway. There are text link adverts on this blog. I was aware that the people who were buying the links were almost certainly more concerned about “buying” a better ranking on Google than something such as click through rates or trying to reach out to the readers of this blog.

I hate to see junk results on Google, for sure. But do I feel guilty about selling links that contribute to this? No. It is individually rational for me to sell these links, despite the fact that I detest the method.

Why? Because if I am selling the links, I make money from them. If I am not selling the links, Google results are still equally junky because so many other people are doing the same thing. So I have two choices. Either I live with junky Google results and make no money, or I live with junky Google results and make some money. It’s a no-brainer.

Funny, though, how the changes leave Google AdSense completely unaffected! What a coincidence. When you look at how Text Link Ads (probably AdSense’s only real competitor) has been penalised to hell by Google, it begins to look like hypocrisy at best and a powerful Google using its might for “evil” means at worst.

However, it is understandable if Google takes a hard-line stance. They strive to have the best search engine on the internet, so of course they will do everything in their power to stop the “sale of PageRank”.

Their latest moves probably change the landscape a bit. It might put some advertisers off, but I doubt it will put any webmasters off. For as long as the webmasters make one penny more by selling adverts than by not, they will continue to sell adverts.

Of course, the reduction in PageRank could mean fewer people visiting via Google’s search engine. But I doubt many webmasters will be licking their wounds over that. From my point of view, for sure Google accounts for about two thirds of visitors to this website. But that is the least valuable two thirds (I don’t mean ‘valuable’ in monetary terms here, I mean in terms of their contribution to the website).

People who visit this website via Google view fewer pages than an average visitor. They are more likely to take one look at one page and then swiftly leave, never to be seen again. They spend an astonishing 30% less time on this blog than the average visitor. According to Rhys, he gets hardly any visitors from Google in the first place.

So if my PageRank takes a battering, I won’t be too bothered about it. Because Google provides none of the things that I value about blogging. Regular visitors are more likely to come via a link on another blog. And the best comments come from regular readers rather than the flash-in-the-pan visitors who might leave personal abuse then exit and forget all about this blog.

Come to think of it, I am the same when I use Google. I never expect to find the best websites by going to Google. If something is worth reading, I am likely to hear of it by word of mouth, either by reading other blogs or via links from my friends on Delicious, or whatever.

Meanwhile, if I want specific information, I am much more likely to search for it on Wikipedia rather than Google. Wikipedia might not be 100% reliable, but Google’s reliability is surely even worse. If I want a primer on any topic, Wikipedia usually gives me what I want.

What’s more, the links on Wikipedia are usually more relevant. Spam links are swiftly removed by the community of users. How many times has Wikipedia led you to a link farm compared to Google?

If I want information on a band I go to either Last.fm or Discogs. If I want to look up a word I use either Dictionary.com, Chambers or Urban Dictionary. Etc, etc. I know I still sometimes use Google, but what for? I can’t remember the last time I used Google search as anything except a last resort.