Monday, November 22, 2010

Muddiest Point for 11-22 - 11-26

Do you think that web search engines will evolve to the point that they will be able to retrieve results from the deep web on a consistent basis?

Readings for 11-29 - 12-3

Weblogs: Their Use and Application in Science and Technology Libraries
Simply speaking, a weblog (or blog) is a Web site that resembles a personal journal and is updated with entries and postings.  What is really nice about blogs is that the entries/postings are dated and are sometimes assigned category headings and keywords.  Essentially, a blog is an online equivalent to a paper diary, reading list, newspaper, and address book all in one.  One of the best features of blogging software is the ability to archive entries, which can be searched, browsed, and reviewed at any time in the future.  The very first weblog was created by Tim Berners-Lee while he was working at CERN, and the practice of posting links to new Web sites in 1993 laid the foundation for what blogs would become, as Rebecca Blood documents in her history of the form.  The term itself was coined by Jorn Barger in 1997 on his site, Robot Wisdom.  Eventually, the "blogosphere" arose, a term describing the large community of webloggers, who now number in the millions.  Blogs are closely related to social software, which adheres to three key principles: support for interaction between individuals or groups, support for social feedback, and support for social networks.

Using a Wiki to Manage a Library Instruction Program
A wiki is a multi-author, collaborative software program that helps people self-publish and share information.  Libraries should strongly consider using this technology to improve information sharing, facilitate collaboration in the creation of resources, and divide workloads among librarians.  The two chief uses for library instruction wikis are sharing knowledge and cooperating in the creation of resources like guidelines and handouts.  Libraries could also create wikis and allow users to participate in building them.  However, libraries should be aware of the risk involved in letting users contribute information.  One way around this could be to require a password to edit the wiki.  Nevertheless, the benefits are numerous.  Libraries could create wiki pages on various professors that include what those professors expect and want in their classes, guidelines on specific assignments within those professors' classes, and updates on changes to professors' assignments.  The uses for wikis in libraries could be endless and should definitely be considered.

Creating the Academic Library Folksonomy
Social tagging is a new but growing phenomenon that allows individuals to create bookmarks for Web sites, save them online, and label them with tags.  These tags include subject keywords chosen by the user and often a brief description of the site.  Libraries could increase their use by incorporating such practices into their own institutions.  This could help students, professors, and researchers find better information when doing various academic tasks.  One of the great advantages of tagging is bringing "gray literature" into play.  A lot of valuable information created by experts and scholars cannot be found easily if students are not connected to the associations or scholarly networks that share this literature.  Tags created by curators who have access to this information would allow students to dive into rich resources that they would otherwise not have been able to access.  On the other hand, one of the great risks of tagging is "spagging," or spam tagging.  Users could tag unsuitable websites for their own profit or simply to cause havoc.  Another problem is the variety of keywords chosen for tags, which could cause much confusion.  Finally, allowing users to contribute to this process would significantly diminish the "gatekeeping" role of librarians.  Are we willing to do this in order to bring more users into play, and thereby increase the use of the library?
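To make the idea concrete for myself, here is a minimal Python sketch (not from the article; the URLs, tags, and notes are made up) of how tagged bookmarks might be stored and then looked up by tag to build a simple folksonomy index:

    from collections import defaultdict

    # Each bookmark pairs a URL with user-chosen subject keywords and a short note.
    bookmarks = [
        {"url": "http://example.edu/gray-lit-report",
         "tags": ["gray literature", "economics"],
         "note": "Working paper shared by a scholarly network"},
        {"url": "http://example.org/catalog-guide",
         "tags": ["cataloging", "instruction"],
         "note": "Quick guide to subject headings"},
    ]

    # Build the folksonomy index: tag -> list of URLs carrying that tag.
    index = defaultdict(list)
    for bookmark in bookmarks:
        for tag in bookmark["tags"]:
            index[tag].append(bookmark["url"])

    print(index["gray literature"])  # -> ['http://example.edu/gray-lit-report']

The same structure also shows the keyword-variety problem: "cataloging" and "cataloguing" would end up as two separate index entries unless someone reconciles them.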

Jimmy Wales and Wikipedia
I found this talk to be very interesting.  I never knew the exact organizational qualities of Wikipedia.  Jimmy Wales did a nice job of explaining the history of the website and how it is managed.  I found it interesting that Wikipedia is much more factual than many people have claimed.  When I was in undergraduate school, I was constantly told never to use Wikipedia because anybody can edit it and there are thousands of mistakes.  Wales pointed out that only about 18% of editing is done by anonymous users, and even when an edit is made, it goes through a process of determining whether the information is correct.  Naturally, with a community the size of the one Wales refers to, there are always going to be problems with certain pages.  I think the process of voting on the deletion of pages is ingenious.  People can vote on whether or not to have certain pages deleted, and pages that seem certain to be deleted may still survive if new information comes to light.  Wales mentioned Wikipedia's policy of neutrality many times.  I appreciate his ardent ideals about being neutral when presenting information, but there is no possible way that any human being can be completely neutral about anything.  Therefore, I see this as one of the downsides of the Wikipedia experiment.  However, this is not Wikipedia's fault; it is just a fact of human nature.

Monday, November 15, 2010

Muddiest Point for 11-15 - 11-19

Since technology is always advancing and the general public is pushing for more digital and on-line information, does this mean the traditional library is in danger of becoming extinct?  Will digital libraries and repositories take over the role that traditional libraries have provided for hundreds of years?

Readings for 11-22 - 11-26

Web Search Engines: Parts I & II
These articles provided a nice summary of the basic functions and setups of various web search engines.  The first article discussed how web search engines index certain types of information.  I found it both fascinating and discouraging that millions of pieces of information are constantly being put on the web and indexed.  The GYM engines (Google, Yahoo, and Microsoft) are indexing information at a thousand times the rate at which they used to.  The discussion about crawling and crawlers was useful as well.  Crawlers save a lot of time because they eliminate duplicate resources, which is very much appreciated in a field that is always pressed for time.  Part II discussed various algorithms and methods used to process search queries.  The most important thing I gathered about queries was the "clever assignment of document numbers" section.  Instead of numbering documents arbitrarily, the indexer can number them in order of decreasing static (query-independent) score.  This enables effective compression of postings lists, skipping (jumping over portions of a postings list), and early termination of query processing.
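To see why the numbering trick helps, here is a minimal, hypothetical Python sketch (not taken from the article) of renumbering documents by descending static score, gap-encoding the postings, and cutting a query off early:

    # Documents are assumed to be pre-sorted by descending static score, so the
    # renumbered ids 0, 1, 2, ... run from best to worst document.

    def build_postings(docs_by_score, term):
        """docs_by_score: list of (original_id, text) sorted by descending score."""
        postings, last = [], 0
        for new_id, (_, text) in enumerate(docs_by_score):
            if term in text.split():
                postings.append(new_id - last)  # store small gaps, not raw ids
                last = new_id
        return postings

    def top_k(postings, k):
        """Early termination: the first k postings are already the best k docs."""
        doc_id, hits = 0, []
        for gap in postings:
            doc_id += gap
            hits.append(doc_id)
            if len(hits) == k:
                break
        return hits

    docs = [("d42", "deep web search"), ("d7", "web crawler notes"), ("d13", "library search tips")]
    plist = build_postings(docs, "search")
    print(top_k(plist, 2))  # -> [0, 2]: the two best matches by static score

Because high-scoring documents get small numbers, the stored gaps stay small (good for compression) and the engine can stop scanning once it has enough good hits.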

Current Developments and Future Trends for the OAI Protocol for Metadata Harvesting
The Open Archives Initiative's basic mission is to "develop and promote interoperability standards that aim to facilitate the efficient dissemination of content."  However, this initiative has spread to a wide variety of other communities that were looking to provide access to information about their respective interests.  Three examples were used to show the diversity of the initiative: the AmericanSouth.org project, the UIUC project, and the OAIster project were all involved with preserving and organizing important information about those particular organizations' interests.  Even though the initiative has brought many benefits to various organizations, there are some challenges that must be tackled.  These include the varieties of metadata, the different formats of metadata, and communication problems within the initiative.  I would suggest that these organizations mandate standards to help counter these problems.  I would use the Dublin Core metadata standards to correct the metadata problems, and I would suggest coming up with a definitive vocabulary for the initiative so confusion can be minimized for future providers and users.
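As a quick illustration of how harvesting works in practice, here is a hedged Python sketch of a basic OAI-PMH request; the repository URL is hypothetical, but the ListRecords verb, the oai_dc metadata prefix, and the Dublin Core namespace are part of the protocol:

    from urllib.request import urlopen
    from urllib.parse import urlencode
    import xml.etree.ElementTree as ET

    base_url = "https://repository.example.edu/oai"  # hypothetical OAI-PMH endpoint
    params = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})

    # Fetch one page of records and parse the XML response.
    with urlopen(f"{base_url}?{params}") as response:
        root = ET.fromstring(response.read())

    # Print each harvested record's Dublin Core title.
    ns = {"dc": "http://purl.org/dc/elements/1.1/"}
    for title in root.findall(".//dc:title", ns):
        print(title.text)

A harvester like OAIster repeats requests of this kind (following resumption tokens for large sets) against many repositories, which is exactly where the inconsistent metadata the article describes becomes a problem.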

The Deep Web: Surfacing Hidden Value
This was, by far, one of the most surprising articles I have read since coming to the MLIS program.  It explains the main differences between surface web sites and deep web sites.  Some of the statistical information the author points out is mind-boggling.  For example, deep web documents are 27% smaller than surface web documents; deep web sites receive about half as much monthly traffic as surface sites; deep web sites are more highly linked than surface sites; 97.4% of deep web sites are publicly available (which was a surprise to me); and finally, the deep web is about 500 times larger than the surface web.  There is also a great diversity of topics covered in the deep web, from agriculture to the humanities to shopping.  The thing that surprised me the most was that the deep web's overall quality (satisfaction) rating is 7.6% higher than the surface web's!  All of this information leads me to believe that there is a large amount of high-quality information in the deep web that is never accessed.  Maybe information professionals should be pushing harder to "surface" some of this information.  Also, there should be a greater effort to educate the public about this topic.

Monday, November 8, 2010

Readings for 11-15 - 11-19

Digital Libraries: Challenges and Influential Work
Being a history major in undergrad, I always find it nice to have some historical background on a subject.  This article was very helpful in laying out the background history of digital libraries.  Basically, the first initiative to look into digital libraries occurred in 1994 with the Digital Libraries Initiative (DLI-1), which was sponsored by the National Science Foundation (NSF).  From 1994 to 1999, 68 million dollars was awarded in research grants to explore this new phenomenon.  The most important aspect of digital libraries and their founding was the joint effort of computer scientists and librarians.  It was a process that saw many challenges, but there were also great benefits.  For librarians and libraries in general, it brought the unique expertise of computer science into the field.  This was extremely important because it allowed librarians to keep doing their work while society continued to push for digital technologies.  It was also an opportunity for computer scientists to develop new systems (which satisfied their desire for creativity), as well as to serve the public user sector.

Dewey Meets Turing: Librarians, Computer Scientists, and the Digital Libraries Initiative
This was another useful article explaining the early work of computer scientists and librarians on digital libraries.  The article explained some of the reasons why librarians and computer scientists started working together on digital libraries.  The librarians obviously needed a technological face-lift in order to continue serving the user community.  It was necessary that they start looking into digital technologies in order to stay relevant.  Librarians also saw this as a wonderful opportunity to bring in much-needed funding for their programs.  Computer scientists were excited because it presented them with the opportunity to create new things that would serve the public good.  This, in turn, could bolster their credentials, which could lead to tenure and other opportunities.  Although there was some bickering along the way, the core function of librarianship remains the same with the advent of digital technologies.  Information must still be organized, collated, and presented to the public for use.  Therefore, librarianship has not changed; it has simply evolved into the new digital age.

Association of Research Libraries: Institutional Repositories
This article was very useful in explaining the role and benefits of institutional repositories.  The author basically argues that every higher-learning institution should look into institutional repositories for the benefit of its students, faculty, and staff.  Some of the benefits of institutional repositories include housing the works of faculty, housing the works of students, and preserving documentation of the institution's work and decisions.  Another key purpose of institutional repositories is to preserve the works of the institution and its members.  By organizing and saving an institution's information, it will be easier to access and research information about the institution in the future.  However, the author points out three dangers of institutional repositories: 1. the assertion of administrative control over authors' works, 2. the possible overloading of repositories with policy baggage and political platforms (using repositories solely to counter the publishing industry), and 3. the possibility of repositories being offered hastily without much commitment on the institutions' part.

Tuesday, November 2, 2010

Muddiest Point for 11-1 - 11-5

Since technology is always evolving and changing, will HTML eventually become obsolete?  Or will it simply die off because other, more advanced markup languages are probably being developed as we speak?

Readings for 11-8 - 11-13

An Introduction to the Extensible Markup Language (XML)
This article did a good job of explaining the basics of XML.  Basically, from what I gather, XML is a subset of the Standard Generalized Markup Language (SGML) that makes it easier to interchange structured documents over the Internet.  XML markup indicates where each part of an interchanged document begins and ends.  I thought the discussion of what XML allows users to do was important (e.g., bringing multiple files together to form compound documents, adding editorial comments to a file, etc.).  I thought one of the most important sections of the article was the discussion of what XML is not.  It is not a predefined set of tags that can be used to mark up documents, and it is not a standardized template for producing particular types of documents.  A final note I would like to touch on is the idea that XML sets out to clearly identify the boundaries of every part of a document, which sets it apart from other markup languages.
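To see what "marking the beginning and end of each part" looks like, here is a minimal sketch (not from the article; the book record is made up) that parses a tiny XML fragment with Python's standard library:

    import xml.etree.ElementTree as ET

    record = """
    <book>
      <title>Introduction to XML</title>
      <author>Jane Doe</author>
    </book>
    """

    root = ET.fromstring(record)
    for child in root:
        # Every part of the document is explicitly delimited by a start tag
        # and a matching end tag, so the parser can recover its boundaries.
        print(child.tag, "=", child.text)

Note that the tag names here (book, title, author) are ones I chose myself, which is exactly the point the article makes: XML supplies the rules for marking up a document, not a predefined set of tags.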

Extending Your Markup: An XML Tutorial
This tutorial was very good at providing the basic fundamentals of XML without getting overly technical.  In this article, XML is defined as a semantic language that lets people meaningfully annotate text.  XML documents look a lot like HTML documents, but there are significant differences (e.g., different elements and rules for describing things).  The discussion of DTDs was very helpful.  They basically define the structure of XML documents and are easy to think of as a context-free grammar.  The most interesting part of the article, to me, was the addressing and linking section.  Basically, in HTML documents, URLs only point to a document as a whole.  In other words, they do not address specific information within the document.  With XML, one can extend HTML's linking capabilities with three supporting languages: XLink, XPointer, and XPath.  This "upgrade," if you will, can make finding information within linked documents so much easier.
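Here is a small, hedged example of what addressing inside a document buys you; the catalog data is made up, and Python's ElementTree only supports a limited subset of XPath, but it is enough to show the idea:

    import xml.etree.ElementTree as ET

    catalog = """
    <catalog>
      <book id="b1"><title>XML Basics</title></book>
      <book id="b2"><title>Advanced Linking</title></book>
    </catalog>
    """

    root = ET.fromstring(catalog)

    # Unlike a plain HTML URL, this path expression points inside the
    # document: the title of the book whose id attribute is "b2".
    title = root.find(".//book[@id='b2']/title")
    print(title.text)  # -> Advanced Linking

XPointer and XLink build on this same kind of expression so that a link can target a specific element of another document rather than the whole page.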

A Survey of XML Standards: Part 1
Although sometimes confusing, this article presented information about XML standards that is necessary in order to understand the general principles of using XML.  Since the article deals mainly with the standards that are constantly being developed for XML, I would like to discuss the section about the many different standards organizations.  For the W3C, the most mature form of a specification is a recommendation, which is technically just a suggestion for further standardization, but these recommendations usually become de facto standards in their own right.  A specification begins as a working draft, which then becomes a candidate recommendation.  The candidate recommendation then becomes a proposed recommendation and, finally, a full recommendation, which ensures its status as a full-blown standard.  Some of the major standards organizations are the W3C, ISO, OASIS, IETF, and the broader XML community.

XML Schema Tutorial - W3Schools
The W3Schools tutorials are very good because they keep information to a minimum, while giving examples of what they are talking about  in the text.  Generally, an XML schems is an XML-based alternative to a Document Type Definition (DTD.)  XML schemas describe structures of an XML document (just like a DTD.)  The main reasons that XML schemas are better than DTDs is that they are extensible to future additions, they are richer and more powerful than DTDs, they are written in XML already, and they support data types.  As mentioned earlier, the two most compelling reasons to switch to XML schemas over DTDs is because they are more powerful than DTDs and they are already written in XML (DTDs are not.)  The future for XML schemas seems to be bright because on May 2, 2001, the W3C formally recommended that XML schemas should be used.  This means that it will likely become the new standard.