Metadata, Schema.Org, and Getting Your Digital Collection Noticed

Submitted by Patrick Hogan on July 11, 2014 - 11:12am

Editor's Note: This post is an excerpt from Improving the Visibility and Use of Digital Repositories Through SEO, by Kenning Arlitsch and Patrick S. O'Brien. The authors, along with Montana State colleagues Jason Clark and Scott Young, will be teaching the online course/workshop Search Engine Optimization (SEO) for Libraries, which starts July 17.

Metadata schemas are powerful frameworks for organizing content, and libraries have long used them to describe their holdings (think MARC). Numerous schemas exist for academic disciplines: CDWA is used for art, Darwin Core for biology, EML for ecology, DDI for social sciences, and so on. Dublin Core is probably the most heavily used schema in digital libraries, and it is perfectly adequate for many applications. The problem with any metadata schema, however, is that most website developers don’t use one at all, and search engines can’t count on the metadata being applied consistently by those who do. The result is that general-purpose search engines like Google tend not to use the metadata even where it is applied appropriately.

Some specialty engines, like Google Scholar, do make extensive use of metadata. Google Scholar, however, wants metadata schemas that can express bibliographic citations specifically and accurately, which Dublin Core does not do very well.

Because search engines crawl the web pages that are generated from databases (rather than crawling the databases themselves), your carefully applied metadata inside the database will not even be seen by search engines unless you write scripts to display the metadata fields and their values in HTML meta tags. It is crucial to understand that any metadata offered to search engines must be recognizable as part of a schema and must be machine-readable, which is to say that the search engine must be able to parse the metadata accurately. For example, if you enter a bibliographic citation into a single metadata field, the search engine probably won’t know how to distinguish the article title from the journal title, or the volume from the issue number. In order for the search engine to read those citations effectively, each part of the citation must have its own field. Making sure metadata is machine-readable requires patterns and consistency, which will also prepare it for transformation to other schemas. This is far more important than picking any single metadata schema.
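To make the point about separate fields concrete, here is a minimal sketch of the kind of script that could turn a database record into HTML meta tags. The record, its field names, and the `render_meta_tags` helper are hypothetical and invented for illustration. The tag names follow the Highwire Press style `citation_*` convention that Google Scholar describes in its inclusion guidelines; check the current guidelines before relying on them.

```python
from html import escape

# Hypothetical repository record: each part of the citation has its own field.
record = {
    "title": "An Example Article Title",
    "authors": ["Doe, Jane", "Roe, Richard"],
    "journal_title": "Journal of Examples",
    "publication_date": "2014/07/11",
    "volume": "12",
    "issue": "3",
}

# Assumed mapping from internal field names to citation_* meta tag names.
FIELD_TO_TAG = {
    "title": "citation_title",
    "authors": "citation_author",
    "journal_title": "citation_journal_title",
    "publication_date": "citation_publication_date",
    "volume": "citation_volume",
    "issue": "citation_issue",
}


def render_meta_tags(record):
    """Return one <meta> tag per field value, ready for the page <head>."""
    lines = []
    for field, tag in FIELD_TO_TAG.items():
        value = record.get(field)
        if value is None:
            continue  # skip fields the record does not have
        # Repeatable fields (such as authors) get one tag per value.
        values = value if isinstance(value, list) else [value]
        for item in values:
            content = escape(str(item), quote=True)
            lines.append('<meta name="{}" content="{}">'.format(tag, content))
    return "\n".join(lines)


print(render_meta_tags(record))
```

Because the article title, journal title, volume, and issue each arrive in their own tag, a crawler does not have to guess where one part of the citation ends and the next begins.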

Introducing Schema.org

We invest a great deal of time and money creating digital collections, and we usually create web pages that describe the collection’s purpose, what it contains, its contributors, and so on, to give visitors some context they can use to understand the collection. We also take great pains in creating metadata that describe each object in the collection to give it meaning and allow users to reference or discuss the item. While humans can understand and associate the concepts they read, search engines have a very limited capacity for interpreting the meaning of the information we so painstakingly provide.

To help search engines understand the context and meaning of our digital objects we must provide structure to our content using additional tags in our HTML. These tags will say to search engines directly, for example, “this information describes a specific digital object as a scholarly paper, written by an author who works at an academic institution, published by an organization on a certain date.” Sounds easy enough, but communicating with a machine requires an up-front agreement on the specific language and precise vocabulary being used to communicate. The word “bloody” has very different meanings to a person raised in the United States and a person raised in the United Kingdom. Search engines do not understand the regional variations, sarcasm, humor, hand gestures, facial expressions, body language, tone of voice, inflection, and so on that humans rely on heavily to communicate meaning.

Enter schema.org. In 2011 Google, Bing, Yandex (the largest Russian search engine), and Yahoo! “joined forces to create a common set of schemas for structured data markup on web pages” with the aim of helping search engines to better understand websites. Originally, schema.org was planned to use only HTML microdata as the mechanism, or language, for implementing schema.org structured data vocabularies. But it has also recently added support for RDFa as an alternative “language” that developers using “RDF-based tools and Linked Data” can use to implement the schema.org vocabulary.
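As a rough illustration of what microdata looks like in practice, the sketch below generates an HTML fragment for a scholarly paper using the schema.org types ScholarlyArticle, Person, and Organization. The function name, its parameters, and the sample values are assumptions made for this example, not part of any repository platform; a real implementation would pull the values from the repository's own metadata fields.

```python
from html import escape


def scholarly_article_microdata(title, author, affiliation, publisher, date_published):
    """Return an HTML fragment marked up with schema.org microdata."""
    def e(text):
        return escape(text, quote=True)

    return "\n".join([
        # itemscope starts a new item; itemtype names its schema.org type.
        '<div itemscope itemtype="http://schema.org/ScholarlyArticle">',
        '  <h1 itemprop="name">{}</h1>'.format(e(title)),
        # The author is a nested item: a Person affiliated with an Organization.
        '  <div itemprop="author" itemscope itemtype="http://schema.org/Person">',
        '    <span itemprop="name">{}</span>'.format(e(author)),
        '    <span itemprop="affiliation" itemscope itemtype="http://schema.org/Organization">',
        '      <span itemprop="name">{}</span>'.format(e(affiliation)),
        '    </span>',
        '  </div>',
        '  <div itemprop="publisher" itemscope itemtype="http://schema.org/Organization">',
        '    <span itemprop="name">{}</span>'.format(e(publisher)),
        '  </div>',
        '  <time itemprop="datePublished" datetime="{0}">{0}</time>'.format(e(date_published)),
        '</div>',
    ])


# Hypothetical values, for illustration only.
print(scholarly_article_microdata(
    "An Example Article Title",
    "Jane Doe",
    "Example University",
    "Example Press",
    "2014-07-11",
))
```

The visible text is unchanged for human readers; the added attributes are what tell a search engine that the heading is the paper's name, that the nested block is its author, and so on.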

We think it’s important for repository managers (and especially catalogers) to be aware of these developments because they hold great promise for fulfilling the potential of the semantic web. Sites that already offer microdata provide a great benefit to Google’s users through its “rich snippets,” which display additional details about web pages in the search results. Another example of Google’s use of microdata appears in its “recipe search,” where metadata about recipes provide a faceted navigational search. If Google can do this for recipes, imagine what it could do for library digital repositories that already have rich metadata describing the objects. The bridge that will get that rich metadata understood by search engines is the set of techniques recommended by schema.org, and putting those techniques into place in digital repositories is the responsibility of librarians and archivists.


Comments (14)

Thanks for your nice post. I hope I will see this type of post again.

Very informative, great article. Bookmarked :)

A nice article has been published.

The use of digital repositories sounds good, but the new trend of SEO will kill us all! I see SEO causing pollution: lots of bad websites are being ranked at the top of Google search. But try to be honest.

Great article. Thanks for your great information; the contents are quite interesting. I will be waiting for your next post.

You have a lovely blog, and I would like to visit it again. By the way, I am a blogger too, and my site is Happy Raksha Bandhan 2014. Although it is on a different topic, keep it up!

I had heard about Schema before, but I didn't get a chance to learn how it works. I'm just getting into ClickMinded SEO Courses and have a lot of knowledge to digest. I found your blog very informative, and I will recommend it to my friends.

The subject of this website is great; I didn't want to believe it at first.