You’ve probably come here because you clicked on the Keyword link from our search page, came across our blog, or arrived from blogs.o.e.  However you arrived, what matters is the information you want.

What is a Keyword, and how do I get one?  The first part is easy: a Keyword is a prominent link that appears at the top of the results based on, well, keywords.  Google calls it KeyMatch, but we are keeping our prior terminology of Keyword.  For example, a keyword could be “academic calendar”: when you type it into search, a link displays in a shaded area above the result set, and clicking that link takes you to the catalog page for the academic calendar.

So how do you get one?  In our previous search engine, we had to enter keywords into a database manually, and there was no policy for establishing keywords.  As the keywords grew and grew, keeping the links fresh became too much overhead.

With the new Google Search Appliance, we are operating in a different mode: we have data to look at, and with good search engine optimization for your pages, organic results should improve.

Previously, keywords were necessary for many users, but with a better organic result set, we can minimize the number of keywords we have to maintain.  Now that we can see the top queries that do and don’t get results, we can make intelligent determinations about what should be keywords.  For everything else, we recommend that you optimize your page for search engines; there is information about this on Google’s site.  If you don’t want to read all about it, the basics are: one, relevant content, and two, other sites linking to your site.
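To make the data-driven idea concrete, here is a minimal sketch of how a query report could be turned into a list of keyword candidates.  It is only an illustration: the report format (pairs of query text and result count), the function name and the sample rows are all hypothetical, and it is not a description of how the Appliance's reports actually look.

```python
from collections import Counter


def keyword_candidates(query_log, min_count=2):
    """Return (query, count) pairs for zero-result queries seen at least
    min_count times, most frequent first.

    query_log is an iterable of (query_text, result_count) pairs; this
    shape is an assumption made for the example.
    """
    misses = Counter(
        query.strip().lower()          # treat "Academic Calendar " and "academic calendar" alike
        for query, result_count in query_log
        if result_count == 0           # only queries that returned nothing
    )
    return [(query, n) for query, n in misses.most_common() if n >= min_count]


# Hypothetical sample rows: (query text, number of results returned)
sample_log = [
    ("academic calendar", 0),
    ("Academic Calendar", 0),
    ("parking map", 12),
    ("finals schedule", 0),
    ("academic calendar ", 0),
]

print(keyword_candidates(sample_log))
```

Queries that repeatedly return nothing are the ones most likely to deserve a keyword; queries that already get good organic results are left alone.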

We’ll be looking at other approaches in the future to address the need for additional promotion, but for now we are taking a data-driven approach to keywords.  That means we will not be taking user requests for keywords at this time.  Stay tuned to this blog for changes.  In the future we hope to make the search reports accessible via a web interface that any user can visit.  If you have feedback, please contact us, or leave a comment here.

Advanced Search

With the release of the Advanced Search function, we have released Search as production.  Hmm, what does that mean?  Well, now if you go to the original search site, at search.oregonstate.edu, it is the new Google Search Appliance search.  So does that mean that’s it, you ask?  Well, no.  There are still more features coming, like Narrow Your Search and Keywords, which we’ll discuss in the near future.  For now, look at the advanced search.  If you don’t find what you are looking for, don’t give up.  Try the advanced search feature: add more words, exclude words, or pick a specific file format.  There are several file formats to choose from.
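If you prefer typing to filling in a form, the same three ideas (add words, exclude words, pick a file format) can usually be expressed directly in the search box.  The sketch below assumes the common Google query operators, a leading minus sign to exclude a word and filetype: to pick a format; the helper name build_query is hypothetical and the form itself may build its query differently.

```python
def build_query(include, exclude=(), file_format=None):
    """Build a search string from words to include, words to exclude,
    and an optional file format such as "pdf"."""
    parts = list(include)
    parts += [f"-{word}" for word in exclude]      # a leading minus excludes a word
    if file_format:
        parts.append(f"filetype:{file_format}")    # restrict results to one format
    return " ".join(parts)


print(build_query(["academic", "calendar"], exclude=["archive"], file_format="pdf"))
# prints: academic calendar -archive filetype:pdf
```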

If you still don’t find what you are looking for, the site you want to search might be new and not crawled yet, or it might not be hosted with us, so we are not aware it needs to be crawled.  In that case, just let us know what the site is, and we’ll look into crawling it.  Read our other post on exceptions for additional information on what we don’t crawl.

Now on OSU Mobile, in v0.2 of our Alpha release, we have People live search.  Simply navigate to m.oregonstate.edu on your mobile phone, select People, and start typing a name, email or phone number.  On select phones, results will begin to appear as you type.

People Lookup

You probably noticed just then that we said select phones, right?  Well, it’s true: the live search feature is only for phones capable of handling the dynamic querying.  For now, this means iPhone, Android, and Palm Pre users.  What about all of the people on Blackberry, you ask?  Well, some functionality exists right now, and we’ll be working on Blackberry phone support in the near future.  On some Blackberry phones with newer OS versions, the live search may be functional; other Blackberry OS versions will just get a normal search, where you type in the name and hit the enter key to get a result.  Because Blackberry phones handle some aspects differently, there may be slight differences in feature capabilities.  For all other non-smart phones, we will also be looking at the right strategy to deliver information to these types of phones.  But don’t hesitate to let us know that you want some capabilities for your phone as well.  Leave us a comment here and tell us what you would like to see.

One of the other things added is a Library menu item to help make it easier to navigate to the Library’s mobile site.  So that’s it for v0.2 right now, for all you people on the go.

How to use the Live Search:  Start typing a name, first or last, email address, or phone number, and a set of results will be returned to you as you start typing.  The more you type, the narrower your list becomes.  For this release we only list the first 20 results, so the narrower your search, the better.
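For the curious, the narrowing behaviour can be sketched in a few lines.  This is only an illustration under stated assumptions, not the code behind OSU Mobile: the directory entries, the field names (name, email, phone) and the function live_search are hypothetical, and only the 20-result cap comes from this release.

```python
MAX_RESULTS = 20  # this release lists only the first 20 results


def live_search(people, typed, limit=MAX_RESULTS):
    """Return up to `limit` directory entries whose name, email or phone
    contains the text typed so far (case-insensitive)."""
    needle = typed.strip().lower()
    if not needle:
        return []  # nothing typed yet, nothing to show
    matches = [
        person
        for person in people
        if any(
            needle in str(person.get(field, "")).lower()
            for field in ("name", "email", "phone")
        )
    ]
    return matches[:limit]


# Hypothetical directory entries, for illustration only
people = [
    {"name": "Pat Beaver", "email": "pat.beaver@example.edu", "phone": "555-0100"},
    {"name": "Pat Otter", "email": "pat.otter@example.edu", "phone": "555-0101"},
    {"name": "Sam Beaver", "email": "sam.beaver@example.edu", "phone": "555-0102"},
]

print([p["name"] for p in live_search(people, "pat")])
print([p["name"] for p in live_search(people, "pat b")])
```

Each extra character is run against the same list again, which is why typing more gives you a shorter, more useful set of results.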

While we hope to have flushed out as many bugs as possible, it is always possible there may be one or two you will encounter.  If you notice any, please send us a note via our help ticket form.

This feature release brought to you by the people of Central Web Services in collaboration with Enterprise Computing, the Valley Library, Web Communications and others.

The first thing to remember about navigating OSU Mobile is not to do it while driving a vehicle.  There’s a law in Oregon that only permits hands-free talking while driving, for those age 18 and over.  If you are navigating OSU Mobile, you are definitely not hands-free.  Read up on the law if you are not aware of it.

So, you’re not driving?  Well then don’t forget to look up and know where you are if you are walking and navigating, so you don’t run into a pole or a trash can or in front of a vehicle or into another person.

Aside from that, navigating OSU Mobile is relatively simple in the current version.  The front page has the OSU logo, and underneath it the icons and text for the various menu items.  Each menu item takes you to a page in OSU Mobile that you can navigate further, or that provides information, links and the footer link to the full OSU website.

On a second-level page, for example when you click on Buildings, the top header changes to include an arrow-tail Mobile OSU logo that returns you to the main page when clicked.  After the logo, the title of the second-level page is shown, and underneath is either the same style of navigation as the main page or other information.  At deeper levels of navigation, the header displays the arrow-tail Mobile OSU logo, the previous navigation page (which takes you back to that page when clicked), and the title of the current page, with further navigation or information underneath the header nav bar.

Stylistically, the Alpha version of OSU Mobile takes a different approach to showing how to navigate: the arrows are part of the design instead of the traditional button style.

[Image: OSU Mobile header navigation bar]

In the image shown here from the Finals Schedules, OSU takes you back to the OSU Mobile Home, and Subjects takes you to a list of available courses.  Art is the title of the current page and is not clickable, and underneath is either information or additional navigation paths.

And that’s basically it for navigating mobile.  Simple?  We hope so.

Have you heard or seen?  OSU has launched the new mobile site, m.oregonstate.edu.

The official launch date was December 7th, 2009, and there was some information in The Daily Barometer and on OSU Today about this, if you want to read what was said then.  Or you can find more information here as we continue with its development.  This project was a collaboration between Central Web Services, which provided the core mobile development efforts, Facilities, the Office of the Registrar, and Web Communications.

What you missed if you did not see it during the launch week was the display of finals schedules, thanks to the information provided by the Office of the Registrar.  Not to worry though, there are plenty more finals to come along in 2010.

So why mobile, and why now, or really why back in December?  The reason is simply the prevalence of mobile computing in the world today.  Phones that weren’t smart are becoming smarter, and the smart phones, well, they are just adding more and more.  There are plenty of studies on how mobile computing is where people will find their main source of information in the future, so I’m not going to rehash that here.  Just use your favorite search engine and go read about it.  So why now?  Now, because we cannot afford to wait and be left behind other universities that are advancing on this front.  The reality is, you shouldn’t have to pull your laptop out of your backpack, especially when it is raining, or go find a computer lab and a computer to log on to, just to find a building.  Much of this information can be provided via a mobile device, and for certain phones we have in fact provided this.

What we dubbed the alpha release is a working release with a few main features: buildings, parking, the full campus map, and some essential phone numbers are provided via m.  There are still some things to work out for future releases, such as location information and other detailed information, but the alpha release was just that, alpha.  Showing this now gives the OSU community the ability to see some of the capabilities that mobile can provide, and lets us ask the question, what’s next?  Well, how about a directory search?  Or news, or personalized class schedules or finals?  These things are all possible.

No doubt everyone has seen news releases about the numerous schools with iPhone applications, right?  Stanford, Duke, the University of Texas and others have released or are releasing applications specifically for the iPhone.  Did you know there is a team working on an OSU iPhone application as well?  That’s right, there is a Capstone Project working on just this.  So then you ask, why another mobile platform?  The simple answer is that not everyone is walking around with an iPhone.  If you’ve acquired a Palm Pre, or an Android, or a Blackberry, what do you do?  Well, that’s why we have m.oregonstate.edu.

Currently, Palm Pre and Android phones are supported (i.e., you should be able to see and use the site).  The newer Blackberries will also work for most features.  With some of the older Blackberry models, the UI will not be correct.  We’ll be laying out a list of supported models in a blog post soon.  Unfortunately we won’t be able to support every single phone, but we figure the ones we do will be a good subset.  And did I mention m.oregonstate.edu will also work on the iPhone?

There is plenty left to do, but the way I see it, we’ve only just begun.

Thank You,

Jos Manuel Accapadi

Interim Associate Director, Central Web Services

So, we’ve been talking about search, and people no doubt wonder if their site will be found with OSU’s Google Search.

In most cases the answer is yes, but in some cases the answer is no.  There are reasons for the no’s, and that is what I want to talk about in this post.

The Exceptions

Why are there exceptions?  There are exceptions for a few reasons.

First is the license limit on our Appliance, which is currently one million documents.  Our current document count is 549,998, and we are still indexing sites as we are made aware of them.  So if a site has a large number of documents, for example a dictionary site where each entry is its own page and is counted as a document, it will eat up the million document limit fairly quickly.  Related to this, some exceptions are because of the applications users or departments use.  For example, a Joomla CMS currently results in a large number of documents returned because of the way the application works.

Second, some sites are not maintained and get hacked or spammed.  We don’t want to index sites that have spam inserted into them, since the spam is likely to show up in the search descriptions.

Third, if crawling a site results in an endless loop, where documents in the site refer back to themselves and the crawler basically gets stuck, we don’t crawl it.

Fourth, if a site returns a large number of errors, then there is something wrong with the site, and it is consuming Appliance resources such as CPU and memory.

Fifth, and encompassing all of these, is the administration overhead.  With all the other functions CWS supports, if a particular search aspect would result in significant administration overhead, we need to make the best decision to minimize that overhead.
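To put the license limit in perspective, the arithmetic can be sketched as follows.  The two totals are the ones quoted above; the function names and the example site sizes are hypothetical and only there to show how quickly one large site can use up the remaining room.

```python
LICENSE_LIMIT = 1_000_000   # license limit on the Appliance, in documents
CURRENT_DOCS = 549_998      # documents indexed at the time of writing


def remaining_capacity(limit=LICENSE_LIMIT, indexed=CURRENT_DOCS):
    """Number of documents that can still be indexed under the license."""
    return limit - indexed


def fits(site_docs, limit=LICENSE_LIMIT, indexed=CURRENT_DOCS):
    """True if a site with site_docs documents fits in the remaining room."""
    return site_docs <= remaining_capacity(limit, indexed)


print(remaining_capacity())   # 450002
print(fits(250_000))          # True, but it would take more than half of what is left
print(fits(600_000))          # False, this one site alone exceeds the remaining room
```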

So what are our current exceptions?

1.  ONID home directories are not searched.  Why?  Mostly because some users do not maintain their sites, and the sites result in spam entries, and across twenty thousand or more, it’s too much of an overhead to manage.  A policy decision was made for this.
2.  http://ecampus.oregonstate.edu/ask-ecampus/knowledge-base/  Why? This site returned over 250 thousand documents.
3.  http://www.cof.orst.edu/org/iawa/  Why?  This site returned over 160 thousand documents.
4.  http://oregonstate.edu/tac/index.php?option=  Why?  This site returned over 600 thousand documents (due to the way the application handles pages)
5.  Group sites at http://oregonstate.edu/groups/ Why?  This is for the same reason as #1.  As part of the move to people.oregonstate.edu for group sites, we will be reevaluating this.
6.  http://oregonstate.edu/webprojects/wiki Why?  This site has 2 million errors.
7.  http://www.familybusinessonline.org/index.php? Why?  This site returned over 400 thousand documents (due to the way the application handles pages).
8.  http://oregonstate.edu/cla/anthropology/gallery/kingston/main.php? Why?  This site was caught in a loop.
9.  http://bioe.oregonstate.edu/reservations/ Why?  This site was caught in a loop.
10.  http://oregonstate.edu/aepcore/index? Why?  This site was caught in a loop.
11.  http://hort.oregonstate.edu/event/  Why?  This site was caught in a loop.
12.  http://recycle.oregonstate.edu/EarthDay/eventCalendar.cfm?  Why?  This site was caught in a loop.
13.  http://extension.oregonstate.edu/clackamas/announcement/  Why?  This site was caught in a loop.
14.  http://physics.oregonstate.edu/event/  Why?  Events list returning excessive results.
15.  regexp:http://www\\.osualum\\.com/?.*cid=[0-9]+.*?  This is a regular expression: any URL that matches the form specified is not crawled.  Why?  This site was caught in a loop.
16.  http://oregonstate.edu/sli/aggregator/announcement/  Why?  This site was caught in a loop.
17.  regexp:http://oregonstate\\.edu/womenscenter/library.*browse=*  This is a regular expression: any URL that matches the form specified is not crawled.  Why?  This site was caught in a loop.
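For readers who want to see how a list like this can be applied, here is a minimal sketch.  It assumes that a plain entry excludes any URL containing that text and that a regexp: entry is tested as a regular expression; the Appliance's actual matching rules may differ, and the function is_excluded is hypothetical.  Only three of the entries above are used, with the regular expression written using single backslashes as Python expects.

```python
import re

# A few of the exceptions listed above; "regexp:" marks a regular expression.
RULES = [
    "http://hort.oregonstate.edu/event/",
    "http://oregonstate.edu/tac/index.php?option=",
    r"regexp:http://www\.osualum\.com/?.*cid=[0-9]+.*?",
]


def is_excluded(url, rules=RULES):
    """Return True if the URL matches any exclusion rule."""
    for rule in rules:
        if rule.startswith("regexp:"):
            pattern = rule[len("regexp:"):]
            if re.search(pattern, url):   # regular expression rule
                return True
        elif rule in url:                 # plain rule: simple "contains" check (assumption)
            return True
    return False


print(is_excluded("http://hort.oregonstate.edu/event/2010/01"))   # True
print(is_excluded("http://www.osualum.com/page?cid=42"))          # True
print(is_excluded("http://oregonstate.edu/admissions/"))          # False
```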

If your site is on this list and you want to discuss this, then contact us.  We do want to reevaluate sites periodically.

We also do not index every type of file extension.  Image files, media files, and archive or binary files are not crawled.  There would just be too many of them, and they would exceed our license limit.

So those are the exceptions and the reasons why.  We don’t necessarily expect everyone to be happy with or agree with the exceptions made, but we have to make the best decisions to support OSU as a whole and keep in mind the limitations of our search engine.  That said, we do want to periodically review our decisions, and also determine if alternative solutions can be implemented.  So if there is a concern, then please contact us.

So Search Beta was released in conjunction with the new top hat design for OSU (another change as part of future upcoming changes).  A great effort between Central Web Services (otherwise known as CWS) and Web Communications.  The same collaborative group that introduced OSU Mobile.  Don’t know about OSU Mobile?  Well for that, visit m.oregonstate.edu (iPhone, Palm Pre, Android and some Blackberry), and I’m sure we’ll be talking about that in other OSU CWS blog posts, so stay tuned.

So what is Search Beta?  It’s just that, it’s really Beta.  We are transitioning information, crawls, and features from the Google Search Appliance to the User Interface for search.  It’s not perfect, and not everything will be found right now.

So you might be wondering how that affects search on your site pages, which use the central code provided by CWS.  Because we have a front end to search, we are able to make it as transparent as possible to site owners.  The goal is that sites shouldn’t need to be modified if they use the search module integration CWS has provided and made available previously.  Integration with Drupal sites will be upcoming, so if your site is not showing results because it has not been indexed, do not worry, we’ll be rolling the Drupal change in soon.

After that there are a couple of things that need to happen.  First, if you are running what we call a virtual host, like hmsc.oregonstate.edu, or a site in a path under oregonstate.edu/, the Google search has to find your site, possibly through links from other sites, and index it.  This is the engine part of the appliance, and Google does a fairly good job with this.  The process could take anywhere from a few hours to a few days, depending on the algorithms Google uses to find new pages.  Anything in the oregonstate.edu/ area is continuously crawled.  There are exceptions (and reasons for exceptions), which we’ll be noting in the days to come and which we’ll talk more about in another post.  Second, if your site is not found after a reasonable amount of time, then we can look at explicitly crawling your site.  This is more common with virtual hosts.  If that is the case, just contact us using our online contact form, but first read the next post about exceptions to sites being crawled.

We’ll also be looking to get some input from users.  You can comment here, or you can comment on the Web Communications blog, where there will also be information about the new home page that will be introduced this year.  In addition to commenting, there will be some focus groups, which is another avenue to provide feedback.  The focus groups will look at search among other things.

So when it comes to the OSU Search, we say Search Me.

Search Introduction

The OSU search tool has been updated to provide a better long-term web search solution for Oregon State University.

The purpose of the OSU Search Category in the CWS blog is to examine the evolution of OSU’s web search, offer more detailed information about OSU’s current search capabilities, and keep OSU current on the happenings with OSU Search.

Background

History

Before the current Google Search Appliance, OSU saw three search solutions through the course of its web history.

Inktomi – 1998-2002
Inktomi was OSU’s first search engine. Inktomi’s base technology was initially developed at Berkeley, and during the mid-to-late 90’s became the driving force behind the Yahoo and HotBot search engines.

Google – 2002-2004
Google originated at Stanford University as project BackRub, named for its weighting of backlinks in its search algorithm. In a few years, it developed into the most popular search engine in the world. As a natural expansion of the search engine, Google developed standalone search appliances aimed at large organizations with a substantial web presence. Google Search Appliances provide a solution-in-a-box for searching large intranets and offer more specific content filtering than is possible with google.com’s web interface. One of these appliances powered the OSU search for almost two years.

Nutch – 2004-2009
In August of 2004, at the end of the Google contract, Central Web Services evaluated Nutch as a replacement search service. Installed on OSU hardware, and running software built, configured, and maintained through close cooperation between CWS and Nutch programmers at the beginning of Nutch’s history with OSU, Nutch powered the search.oregonstate.edu service for many years. During this period Google advanced many features of their appliance and search capabilities. Because of the advanced capabilities of Google and the overhead of dealing with the issues in the Nutch release, the decision was made to sunset Nutch for OSU and return to a much improved Google Search Appliance.

Google Search Appliance – 2010-

On January 1st, 2010, Central Web Services unveiled the new Google Search Appliance.

Why Change?

The migration to Nutch in 2004 was initiated to improve flexibility and extensibility, and because Nutch is an open source product, access to the code was available. As other advancements occurred in technology, there was not adequate time or personnel to focus on the code changes needed for the Nutch search engine to reach a stable state and meet the growing needs of OSU. In October of 2009, the decision was made to let the search experts take care of search, while the administration, the front end design and other enhancements to search management would be maintained by the good people of Central Web Services.

Support

Contact Central Web Services with questions, comments or concerns.