Media translation and the issues that arise

Having read various Chinese novels in English translation, I found it clear that some translations were better than others, depending on the approach the translator took. Translation raises various concerns, including subjective interpretation, which can prevent the original story from being reproduced effectively in another language. Some translators treat the content and form of the text as what matters most, whereas others focus on the message it conveys, or on its syntax, punctuation and rhythm. The biggest problem of all, however, is that some meanings and words cannot be translated at all. From the perspective of Chinese, English can seem a simple language, and as a result there just aren’t the words to describe many situations effectively. Similar issues, as Katherine Hayles points out, arise in the digitising of texts.

For many, the biggest problem in media translation is the loss of meaning. Meaning for me is not just the words on the page and the story they create, but also the meaning that comes with the physicality of the book. In some cases the physical object and its material are the only reason a book has any meaning.

Looking beyond the physical book itself, there is also a sense of originality and authenticity that would be lost in translating a document into electronic text. The significance a book possesses could lie in the font it is written in, or in its illustrations or diagrams. The Nehemiah Grew text we digitised in our lesson had all of these features, as well as the editor’s corrections, and this is what made the book so captivating and interesting. Therefore, my essay will focus on these issues of translation, and on whether or not we can overcome them.



Essay Blog Post: The (New) Translator’s Task?

In her book My Mother Was a Computer: Digital Subjects and Literary Texts, N. Katherine Hayles writes the following: ‘I use the term “media translation” to suggest that recreating a text in another medium is so significant a change that it is analogous to translating from one language to another’ (p.109).

Hayles is suggesting that translating a text from one language to another is analogous to remediating a text from the physical codex into a digital one, in other words, to creating a digital edition of a physical text. The heart of my essay responds to this claim by exploring the challenges faced by a digital scholarly editor when remediating a text, to see if they run parallel to the challenges faced by a translator of languages.

The theoretical discussion of how a translator’s challenge depends on the target (whether the aim is to replicate the text as closely as possible or to reimagine it for an intended audience) supports Hayles’ claim. Discussing this only in theory would be limiting, however, because remediating texts and translating languages are practical activities too. My essay therefore applies these challenges to my first-hand experience of digitally remediating Reynard the Fox.

The decisions I had to make for my own translation of the text turned out to be not dissimilar to those faced by translators of language, and so the theoretical arguments and the practical process behind digital remediation together construct an argument in support of Hayles’ claim in my essay.

‘Lost’ Shelley text online.

This, from Faith Binckes:

This pamphlet, Shelley’s Poetical Essay on the Existing State of Things (1811), was essentially lost until 2006. As of last month, it is freely available on the Bodleian Library website. It’s a great opportunity to use your newly-acquired book-historical skills to explore this very rare material object (albeit in two-dimensional form). It’s an equally good example of how presenting and framing a text digitally can lead your readers into a whole set of related issues, and how decisions regarding which issues to choose shape interpretation in particular directions. You also get a nice introduction from Vanessa Redgrave.

How are literary texts preserved, disseminated and displayed on the internet?

By Christine Bradley and Lorien Kaack

Over time, the internet has caused more and more anxiety over how important information is preserved, displayed and disseminated. Over the past few weeks we have been looking at authors such as Roy Rosenzweig, Robin Sloan and Alan Liu, who have written on these anxieties. Rosenzweig notably has a more negative tone than Liu, who sees the internet and its progression as positive.

Rosenzweig writes of his anxiety about data on the internet and how it can be preserved. He states, ‘Ignacio’s sudden deletion of Bert should capture our interest as historians since it dramatically illustrates the fragility of evidence in the digital era’ [1]. He fears that important information stored on the internet will be lost, since most data is subject to bit rot and has a life expectancy of 10 years. We can’t keep saving all this information, so how do we choose what to preserve and what not to preserve? ‘The most calibrated mix of technical solutions will not save the past for the future because the problems are much more than technical and involve difficult social, political and organisational questions of authenticity, ownership and responsibility’ [2].

The character Kat in Mr Penumbra’s 24-Hour Bookstore is very optimistic. She has complete faith in the idea of the whole of the internet being stored within containers, and seems rather zealous and proud of this. People such as Rosenzweig, however, are very resistant to this idea of keeping everything on the internet in one place. Can it all be preserved, and if not, how do we decide what to preserve?

The idea of displaying information on the internet can be shown through a digital literary project. A digital literary project is a collection of information which has been manipulated and made machine-readable in some way, and then made human-viewable in some form.

It requires three stages:




Each of these three processes involves important decisions about the scholarly project you are undertaking and what you want to do with it. These include intellectual decisions that shape the information and what you want it to be used for.

By cataloguing a source you are grouping it into one category and not another; similarly, by transcribing it you are separating the text from its context. All of these things are necessary if you are going to make a digital project, but all of them create a certain authority and certain limitations, and breaking the project down this way can help you understand it.

An example of this is the END (Early Novels Database):


  • University of Pennsylvania Library. 1200 novels, 1660–1830

  • We don’t see the actual text, just the metadata of the text (publication date, title, etc.); they use lots of information

  • Organized metadata into specific fields

  • Available for people to interact with

For this project the editors have decided to give only metadata on the texts, rather than the texts themselves. This is important as it shows what they want the webpage to be used for. Rather than putting all the information from the books on the database, they have chosen a certain category that they wish to display, and this will determine why it is used and by whom. On the END website’s ‘About’ page they describe their reasoning for displaying the information this way: ‘The END (Early Novels Database) Project creates super-rich metadata about fiction in English in order to help researchers imagine new histories for the novel’ [3]. The metadata is also a way to show how these early novels organise themselves; it is ‘about how early novels instruct readers about themselves, carefully describing prefaces, introductions, and dedications; tables of contents, indexes; title-page genre terms and footnotes buried deep within the text’ [4].
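To make the idea of a metadata-only record concrete, here is a minimal sketch in Python. It is not the END's actual schema: the class name, the field names and the two placeholder records are all invented for illustration, loosely following the kinds of features mentioned in the quotation above (title-page genre terms, prefaces, dedications and so on).

```python
from dataclasses import dataclass, field


@dataclass
class NovelRecord:
    """Hypothetical metadata record; note there is no field for the full text."""
    title: str
    publication_year: int
    title_page_genre_terms: list[str] = field(default_factory=list)
    paratexts: list[str] = field(default_factory=list)  # e.g. "preface", "dedication"


def with_paratext(records, kind):
    """Return only the records that list the given kind of paratext."""
    return [r for r in records if kind in r.paratexts]


# Placeholder records, invented for this sketch (not real catalogue entries).
records = [
    NovelRecord("Placeholder Novel A", 1722, ["history"], ["preface", "table of contents"]),
    NovelRecord("Placeholder Novel B", 1799, ["romance"], ["dedication"]),
]

print([r.title for r in with_paratext(records, "preface")])
```

Because each feature sits in its own field, a researcher can ask questions such as "which novels have a preface?" without ever reading the novels, which is exactly the kind of use the editors' choice of fields encourages.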




[1] Rosenzweig, Roy. ‘Scarcity or Abundance? Preserving the Past in a Digital Era.’ The American Historical Review, 108 (3), 2003, p. 736.


[2] Ibid., p. 747.


[3] Early Novels Database [Online] Available from:


[4] Early Novels Database [Online] Available from:


What makes a book worthy of preservation?

By Sophie Lee, Erin Brown and Jimmy Barton.


Commercial reasons

The most obvious answer would, of course, be commercial gain. The popularity of a text goes a long way in determining whether it is republished, which is a method of preserving the text, although not the physical work.

Historical importance

If a text has particular historical importance, for example if it still contains writing and annotations from the 17th or 18th century, then it would also usually be seen as an important piece of work to preserve. This, of course, would be preservation in the physical sense as well as preservation of the words in the text.

To an extent, both the text and the physical copy can be preserved digitally, thanks to scanning and encoding. However, only an image of the physical text would be preserved, rather than the physical text itself.

Preservation in Mr Penumbra’s 24-Hour Bookstore

In Mr Penumbra’s 24-Hour Bookstore, there is a belief that the ‘Codex Vitae’ holds the key to immortality, and so it is being preserved. This shows how a book’s influence and effect are key values in determining preservation. If the book weren’t so significant, would they still be trying to crack the code and preserve it? There is an air of selfishness, in that only a select few know the truth behind the book and believe that it holds the key to immortality. It is a symbol of power. On the other hand, the age of the book also makes it desirable. Google wishes to digitize and preserve it because that is its goal with everything. The character Kat expresses this desire for all-encompassing knowledge.

How do these corporations operate? The Festina Lente Company makes its money from copyrighted fonts. It is much smaller than Google and has more personal reasons for preservation, such as the numerous ‘Codex Vitae’ of the members of the Unbroken Spine. Google’s money from advertising drives its projects of preservation and digitization. It wishes to preserve life through keeping these works of literature. Google’s scope and audience reflect its aims of preservation. Clay uses the ‘Grumble Gear’ scanner to digitally preserve these ‘Codex Vitae’ for his team and Google. This method is much more contemporary, and much more technologically supported, than the book chase and decoding that the Unbroken Spine makes its members do.

Google’s immortality is symbolized in the significant use of its search engine, and in people’s subconscious, continuous use of its web services. It is similar to the immortality that is hidden in the font ‘Gerritszoon’: it is so obvious that it is hard to notice. The preservation value of the book lies in its significance for others, be it personal or commercial. Alan Liu’s argument suggests that a symbiosis of Google’s methods and the FLC’s methods of preservation would ultimately be the most effective. Liu states, ‘it may be that experiencing and communicating literature through social-computing technologies will do more than supplement older reading, interpreting, and performing practices. The payoff will be an evolution in our understanding of the nature of reading, interpreting, and performing.’

Here, technology and conventional methods work together to improve the preserved piece, through its interpretation and understanding.


1813 – Second edition of Sense and Sensibility – Jane Austen.

Image courtesy of Peter Harrington – London

2011 published edition of Sense and Sensibility – Jane Austen.



Image courtesy of AustenProse.

Breaking the “Code” by Scanning the Text?

By Gareth Williams and Santino Prinzi


One of the most exciting scenes in Mr Penumbra’s 24-Hour Bookstore by Robin Sloan is when Clay sneaks into the Festina Lente Company (FLC) with the Grumble Gear book scanner so he can scan Manutius’ codex vitae, which the FLC are trying to crack. He does this so that a computer can read the text on his behalf, in the hope of discovering the meaning of immortality. Clay also scans Ajax Penumbra’s codex vitae, as he fears the FLC will destroy it if they find out what has been done (which they do). By scanning the codex vitae Clay remediates the physical printed text into PDF images, which we can do ourselves, but there’s more to it than that. The images are then transformed into readable, workable plain text using Optical Character Recognition (OCR) software, but this doesn’t always work.

Although the characters use a rudimentary cardboard system in the novel, the process of scanning texts has become incredibly popular. In the same way that the FLC use the codex vitae to preserve life, many historians and archivists are turning to OCR in order to save texts that could easily be lost. However, this isn’t an easy process, as the book must first be scanned to a JPEG or PDF and then encoded using a bitmap system. Obviously, the more degraded the original copy, the harder it is to get a clear, legible transcript. OCR functions are often described as ‘brittle’ for this reason; errors created in the early stages of encoding are quite likely to end up in the final product. Here are some examples:

Images courtesy of HathiTrust. Images of the original text available at:

As you can see, although a lot of the text has been recognised easily, some of the letters were not recognised by the software (such as the ‘E’ being so condensed that it is seen as an ‘x’). The main problems arise from the use of different typefaces and smaller fonts.

Images courtesy of HathiTrust. Image of original text available at:

As you can see from the second example, there are times when the software almost completely fails. It not only struggles to recognise words in italics, but also reads the smaller fonts as symbols. It often happens that some words appear almost legible even though they have collided together. This is demonstrated in the plague manuscript: “Infection of the Plague seldom, if ever, …” is encoded as “Ilffctctiofl’ of the Plagae ſeldonyiflevjer”. Errors are unavoidable in any computer system, but when scanning older texts it is apparent that they are incredibly frequent.

This example demonstrates a key message in Sloan’s novel about the use of technology: embrace it, but don’t rely on it solely. There are distinct differences between the PDF file and the plain text, and the plain text would require substantial manual editing in order to correct what the OCR software couldn’t do for us. Failures like this do not mean we shouldn’t be using computers to aid us, just as the failures in Sloan’s novel do not stop Clay from trying to break the code of the codex vitae. We’re looking forward to any future failures (and successes, hopefully) as we experiment with OCR software and other digital tools on this module.
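The gap between the printed line and the OCR output in the plague example can be put into rough numbers. The sketch below is only an illustration, not part of any OCR package: it uses Python's standard-library difflib to compare the two strings quoted in this post. The function names are our own, and the expected line stops where our quotation stops.

```python
from difflib import SequenceMatcher

# The two strings quoted in this post (the printed line is cut off where our quotation ends).
expected = "Infection of the Plague seldom, if ever,"
ocr_output = "Ilffctctiofl’ of the Plagae ſeldonyiflevjer"


def similarity(a: str, b: str) -> float:
    """Return a score between 0 and 1; 1.0 means the two strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


def matching_runs(a: str, b: str, min_len: int = 3) -> list[str]:
    """List the stretches of text (at least min_len characters) that survived OCR intact."""
    matcher = SequenceMatcher(None, a, b)
    return [
        a[block.a:block.a + block.size]
        for block in matcher.get_matching_blocks()
        if block.size >= min_len
    ]


print(f"similarity: {similarity(expected, ocr_output):.2f}")
print("intact runs:", matching_runs(expected, ocr_output))
```

A score like this does not correct anything, but it could help an editor decide which pages need the most manual attention before the plain text is usable.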