Friday, September 12, 2008

Third Week's Readings

I'm going to talk about three of this week's readings, namely "Identifiers and Their Role in Networked Information Applications" by Clifford Lynch; Chapter 9 from Digital Libraries by Arms; and Lesk's chapters 2.1, 2.2, and 2.7.

Together, these readings give a bigger picture of how the Internet works and how URLs, URNs, and other pieces of digital life come together as key parts of a digital library.

Lynch's article:
  • ISBNs and ISSNs = identification standards for books and serials; they predate, and relate to, the standards for URLs and URNs
  • Difference between URL and URN: a URL points to a location, while a URN identifies a resource by name, independent of where it is stored.
  • DOIs = built on the Handle System; allow access to those who subscribe (important for digital libraries that are not open); can help with citations, but there are still kinks to be worked out
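To make the URL vs. URN difference concrete, here is a minimal sketch in Python. The two example identifiers and the `describe` helper are my own illustrations (assumptions), not something taken from Lynch's article: the URL carries a host and a path, i.e. a location, while the URN carries only a namespace and a name.

```python
from urllib.parse import urlsplit

def describe(identifier):
    """Split an identifier and report whether it names or locates a resource."""
    parts = urlsplit(identifier)
    if parts.scheme == "urn":
        # A URN has the shape urn:<namespace>:<name>; there is no host to contact.
        namespace, _, name = parts.path.partition(":")
        return {"kind": "URN", "namespace": namespace, "name": name}
    # Anything else is treated here as a URL: a host plus a path on that host.
    return {"kind": "URL", "host": parts.netloc, "path": parts.path}

# Hypothetical identifiers, used purely for illustration.
url_info = describe("http://www.example.org/library/catalog.html")
urn_info = describe("urn:isbn:0451450523")

print(url_info)
print(urn_info)
```

If the catalog page moved to another server, the URL would break, but the URN would still name the same book; something else (a resolver) has to map the name to a current location.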
Arms chapter 9:
  • All about text. Text is what makes the digital world go round.
  • Text can be used for management, organization, display
  • Conversion -- *this may be helpful if we need to scan things to include in our digital library! Converting a scanned page image to text allows it to be searched. This needs one (or more than one) character recognition program.
  • ASCII and Unicode
  • Unicode supports and can represent many different languages and scripts - making it easier to store these languages directly instead of transliterating them.
  • HTML and XML -- mark-up languages: codes (tags) embedded in the text.
  • XML is built on Unicode (Arms describes it as 16-bit Unicode; XML processors must be able to read both UTF-8 and UTF-16)
  • CSS and XSL: Cascading Style Sheets and Extensible Stylesheet Language: CSS works with HTML mark-up and XSL works with XML mark-up. Currently XSL is not as widely used as CSS.
  • TeX: the earliest of the page description languages. Deals with typesetting. Best for math journals.
  • PostScript: originated for graphics
  • PDF: (Portable Document Format). Came from PostScript and works best because it is legible on screen and on the printed page.
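The ASCII and Unicode notes above can be checked with a small sketch. This is only an illustration under my own assumptions: the sample words and the `encoding_report` helper are made up for the example, and UTF-8 and UTF-16 are just two common ways of storing Unicode text.

```python
# Sample words in three scripts (chosen for illustration only).
samples = ["library", "bibliothèque", "図書館"]

def encoding_report(text):
    """Report whether text fits in ASCII and how many bytes it takes in UTF-8 and UTF-16."""
    try:
        text.encode("ascii")
        fits_ascii = True
    except UnicodeEncodeError:
        # At least one character falls outside the 128 ASCII characters.
        fits_ascii = False
    return {
        "text": text,
        "fits_ascii": fits_ascii,
        "utf8_bytes": len(text.encode("utf-8")),
        # "utf-16-be" avoids counting the byte-order mark.
        "utf16_bytes": len(text.encode("utf-16-be")),
    }

reports = [encoding_report(s) for s in samples]
for report in reports:
    print(report)
```

Only the plain English word fits in ASCII; the French and Japanese words need Unicode, which is why Unicode matters for a library holding material in many languages.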
Lesk's Chapters 2.1, 2.2, and 2.7
  • Typesetting. Lesk goes through the history of typesetting from paper to online.
  • Text Formats. Reinforces what I learned from Arms about ASCII and Unicode. He discusses MARC and SGML, as well as HTML. All of this is important in the display and the retrieval of documents!
  • Keying vs. Scanning -- does anyone really key anymore??
  • Scanning is cheaper and can produce high quality documents and images, but that also depends on the state of the original document and the scanning technology being used.
  • Can poorly scanned documents fall victim to poor retrieval methods for keywords and images?
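One way to think about that last question is a toy example. This is a rough sketch, not a method from Lesk: the "scanned" sentence with its recognition errors is invented, and approximate matching with Python's `difflib` is just one possible workaround that I am assuming for illustration.

```python
import difflib

clean_text = "the digital library stores scanned pages"
# Invented OCR output: the letter "l" misread as the digit "1" in two words.
ocr_text = "the digita1 1ibrary stores scanned pages"

def exact_hit(keyword, text):
    """True if the keyword appears as a whole word in the text."""
    return keyword in text.split()

def fuzzy_hit(keyword, text, cutoff=0.8):
    """True if some word in the text is a close (not necessarily exact) match."""
    matches = difflib.get_close_matches(keyword, text.split(), n=1, cutoff=cutoff)
    return bool(matches)

print(exact_hit("library", clean_text))  # exact search on the clean text
print(exact_hit("library", ocr_text))    # exact search on the OCR text
print(fuzzy_hit("library", ocr_text))    # approximate search on the OCR text
```

In this toy case the exact keyword search misses the badly recognized word, while the approximate search still finds it, so poor scans can hurt retrieval unless the search tolerates errors.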

Muddiest Points for Week 2
What kinds of things should we know from the readings for the midterm?
