How Are Search Returns Pulled?
How are search returns determined when someone visits your blog and types in a search term using the sidebar search widget?
Are article titles given priority over article content?
Here’s an example from my blog: Type in “Boob Jobs” with the quotes and an article with that exact title will be returned; then do the same search using only — Boob Jobs — without the quotes, and the article with that name will not be returned.
Now let me give you a less provocative example.
Let’s say I want to search for artist “Claes Oldenburg” and I know I have an article titled “Claes Oldenburg’s Torn Notebook.”
We’ll leave out encasing quotes in these examples even though I’ll use them here so you know what I typed into the search box:
“Notebook” — I do not get the right article.
“Claes” — I do not get the right article.
“Cornhusker” — I get the article returned!
These search results are not randomized: I get the same articles returned every time I repeat a search for the same term.
I think this thread might give some answers:
Hi nosysnoop —
Thanks for the link, but I’m not talking about truncated search returns.
I’m asking about how posts and articles are indexed by WordPress.com and then what weight — what algorithm — is used to decide which posts are presented based upon the keyword search criteria entered.
The issue may seem esoteric, but it is important to know how people search to find content on your blog using the sidebar widget.
It seems article content is given more weight than the title of an article but I’m not certain that’s the case in all instances and I’m asking and wondering why.
If you go to my site and enter the search examples I’ve provided you can see the issue I’m discussing.
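For anyone wondering what "weight" could mean in practice, here is a minimal sketch in Python of one way a search could favor titles over content. It is not how WordPress.com actually ranks results, since nothing in this thread confirms that. The `Post` class, the `score` and `search` functions, the default weights of 3.0 and 1.0, and the sample posts are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    content: str


def score(post: Post, query: str, title_weight: float, content_weight: float) -> float:
    """Count case-insensitive occurrences of each query word; title hits and
    content hits are multiplied by their own (hypothetical) weights."""
    title = post.title.lower()
    content = post.content.lower()
    total = 0.0
    for word in query.lower().split():
        total += title_weight * title.count(word)
        total += content_weight * content.count(word)
    return total


def search(posts: list[Post], query: str,
           title_weight: float = 3.0, content_weight: float = 1.0) -> list[Post]:
    """Return the posts that match at least one query word, best score first."""
    scored = [(score(p, query, title_weight, content_weight), p) for p in posts]
    matches = [(s, p) for s, p in scored if s > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in matches]


# Made-up sample posts; only the first title comes from the thread.
posts = [
    Post("Claes Oldenburg's Torn Notebook",
         "A sculpture in Lincoln, home of the Cornhuskers."),
    Post("Sketching Outdoors",
         "I carried a notebook and the notebook got wet."),
]

# Title hits count three times as much as content hits.
title_first = search(posts, "notebook")

# Title and content hits count equally.
equal_weights = search(posts, "notebook", title_weight=1.0)
```

With the title weighted more heavily, the Oldenburg post ranks first for "notebook" because its single title hit outscores two content hits in the other post. With equal weights the order flips, which is the kind of behavior that would make content appear to matter more than titles.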
IMO, although I could be wrong, this is not something that is likely to be discussed on the forum. I would be inclined to send feedback or an email to support at wordpress dot com.
I don’t understand, timethief, why this is an inappropriate topic for discussion in the support forum.
I meant that I don’t know. drmike will be away for a few days and staff are busy. I thought this is more like something that would be explained in an email to you by staff, with a description posted later in a pink sticky on the forum, but I could be wrong. Either way, when staff have the time they do read forum posts, so they will know you have asked. :)
Okay, cool, timethief, I thank you for your insight and your good thoughts.
I don’t want to bother support for this because it isn’t time-critical, but from an Author point-of-view it is sort of important to know how our stuff here is indexed and spat back on what terms.
On the standalone version of my WordPress blog I seemed to get more robust and accurate search returns than I’m getting here on the Dot Com hosted version and I’m curious as to the why of it.
Believe me, as a former librarian I am a blogger who has sorely lamented the fact that universal standards like the Library of Congress Subject Headings and the Anglo-American Cataloguing Rules (indexing) are not in use.
I continue to struggle with assigning categories and tags every day. And I frequently come close to banging my head on my computer screen when I cannot find something using the search box.
I am more than curious and you can bet rain(coaster) is reading every word too. Please do share whatever you learn. ;) I’m sure all of us would love to understand how the system works and how to use it effectively.
raincoaster is indeed reading every word. At first the searchbox on my blog wasn’t very accurate, but I have found as time goes on that it seems to “learn” and now returns exactly what I’m looking for on a fairly regular basis. Of course, it could just be that it has me trained now…
timethief — Thanks! I’m with you on the struggle for relevance and “ease of finding stuff.” As blogs grow to 1,000 entries and beyond, we need a reliable and predictable way to carve into those articles via searches by getting right returns. It would be so great if one day comments were searchable as well but with sites like mine with nearly 18,000 comments, I realize that want may not soon be met.
raincoaster — Have you been able to quantify this “learned” search process or are you just being anecdotal? Can you provide search examples of this learning?
Unfortunately, yes, you’ll have to send in feedback, because the search function has a long history of being inadequate, and it’s something only the admins can fix. You can try using the default theme or K2-Lite, which have a built-in live search that is supposedly better.
The K2 Lite search is superior to the search on the other ones I’ve used before. But I’m going nuts with 404’s to titles that actually exist on my blog. It’s making me grind my teeth and I don’t have a dental plan :D. A lot of my readers are oldsters who are new to computers. When something 404’s they don’t know what to do and are intimidated by the message to click in the sidebar. I babysit them by telephone and email, but the reason I got a blog was to get away from the email, newsletters, listservs and telephone calls.
I consider the search utility to be very important. In my books it’s right up there with categories/tags or whatever they are called today. They are also 404ing on my blog … arggghhhh! I will send in feedback later today and I’m sure boles and rain will too.
sunburntka — I’ve been using K2-Lite since I landed here. Are you sure K2-Lite has built in live search? I know the standalone K2 version has live search but I don’t see that capacity available in the K2-Lite theme on WordPress.com
timethief — I didn’t realize different themes provided different search returns. Aren’t all themes using the sidebar search widget? Why would the returns be different if the same widget is being used?
This definitely is the place to discuss this topic. Please continue.
It seems to me that the K2 Lite search box does work better than the ones I experienced on other themes. However, what “seems” to be is not necessarily what “is”. Thus it wouldn’t surprise me if the answer we get is “they are all the same”. I could have simply “learned” the way rain has been “educated” by her searchbox.
boles has very clearly stated what we want to know, so where to go from here is the question. IMO, at this juncture we need to hear from staff. The two alternatives seem to be (1) we can all send in individual feedback and drive Mark nuts or (2) we can just leave this thread on the forum and bump it up until we get some answers.
I like option (2).
forestneeds — Love your blog! It is unique and funky. Thanks for the support that this arena is the place to discuss the search issue.
timethief — I’m really not ultimately interested in themes and searching because what I’m hearing about that is some themes seem broken when users do a search and that should be reported. I’m more interested in knowing how — when everything is working as expected — the database feeds back search returns based on our queries because knowing how that process orders and prioritizes words and then values them for presentation can influence our post content and our post titles.
Yes, I have understood this all along and I stand with you waiting for a response from staff and/or the developers. :)
Okay, thanks, timethief! I just want to make sure others reading this understand the search issue I’m bringing up is universal and not theme-based.
I’m not a quantifier, generally speaking. All I can say is that for the first three months or so I found the search useless, but that now it returns the answer I’m looking for almost all of the time. I’ve been using WordPress since the end of February this year.
Thanks for sharing your experience, raincoaster! Perhaps the search returns do somehow “learn as they grow” but I’d certainly like to know how that is possible.