monkeyflash.com
November 5th, 2006 | Posted in ColdFusion, Tutorials
Recently I needed to use Verity’s Spider (Vspider) to crawl my dynamic website and create a collection that could then be searched. I found useful information here and there, but no single place had all the answers. I’m not going to pretend to have all the answers either, but hopefully you will find this beginning-to-end tutorial on making a dynamic site searchable with Vspider helpful.
The last several versions of ColdFusion have shipped with a 3rd-party indexing engine called Verity. Creating standard Verity collections is covered in most ColdFusion books in great detail. The standard indexing is great if you have static .html or .cfm pages but what if you have pages that are mostly database driven?
That’s where Vspider comes in. Vspider is a crawling tool that will go out on your website and index your dynamic pages. It’s not perfect, but it does come free with ColdFusion for local site use. Similar products can cost thousands of dollars, so Vspider is worth a look.
Step 1 – To use Vspider locally your site needs to resolve using the localhost domain. To test this, just open a browser on your server and type in http://localhost/. If your home page shows up then you are good to go. If it doesn’t you will need to change your IIS settings to point the localhost domain to your site. It is also fine for your home page to be in a subdirectory like http://localhost/mysite/.
Step 2 – Choose a directory on your server to store a text file that you will call from the command prompt. I chose the verity directory under my CF installation directory.
Next create a text document using Notepad to contain all of your command options. Using a text document is much easier than typing the same thing over and over into the command prompt.
You can reference the Adobe livedocs for all of the Vspider commands but your document should look something like this:
-style c:\cfusionmx\verity\data\stylesets\coldfusionvspider\
-collection c:\cfusionmx\verity\collections\newsite
-exclude http://localhost/newsite/_mmServerScripts/*
-exclude http://localhost/newsite/_notes/*
-exclude http://localhost/newsite/admin/*
-start http://localhost/newsite/index.cfm
-cgiok
I saved this text file as myindex.txt.
Step 3 – Once the file is created you are ready to index your pages. First open a command prompt window by selecting Start -> Run, entering cmd, and clicking OK. I normally change to the root of the C: drive by entering cd C:\. Next, use the -cmdfile option to point Vspider at the command file we created earlier. The call looks like this:
vspider -cmdfile c:\cfusionmx\verity\myindex.txt

Text file and Command Prompt code.
Step 4 – Open CF Administrator and browse to the Verity Collections section. Create a collection by filling out the form under Add New Verity Collections. The name of your collection should be the same as the collection you created during indexing. For this example it is newsite. Choose Englishx as the Language setting because this is the language that Vspider uses. Under Path, enter the directory where your collection was stored during indexing. For this example it is c:\cfusionmx\verity\collections.

Settings as they appear in CF Administrator.
Step 5 – Once you’ve entered the collection into CF Administrator you should see the collection in the list under Verity Collections. The number of documents in the collection should match the number indexed by Vspider. At this point the collection has been created and mapped and is ready for use.
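If you would like to cross-check the document count outside CF Administrator, a minimal sketch like the following lists the collections ColdFusion knows about. The file name and the query name collections are hypothetical, and the columns shown in the dump can vary by ColdFusion version.

```cfml
<!--- check_collections.cfm (hypothetical file name) --->
<!--- List every Verity collection registered with this ColdFusion server --->
<cfcollection action="list" name="collections">

<!--- Dump the whole query rather than assuming specific column names --->
<cfdump var="#collections#" label="Verity collections">
```

Look for the newsite row in the dump and compare its document count with the number Vspider reported.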
Now it’s time to create a search form to test your collection. The following form will do the trick.
<form method="post" action="search_action.cfm">
<input name="criteria" size="30" maxlength="50" type="text" />
<input value="Search" type="submit" />
</form>
Place this form in a CFM file and save it as search.cfm.
You’ll also need an action page to process your search form. This page will include a cfsearch call and a cfoutput to display the results.
<cfsearch collection="newsite" name="GetResults" criteria="#Form.criteria#" suggestions="1" status="info">
<!--- Running counter used to number the results --->
<cfset count = 1>
<cfoutput query="GetResults">
<dl class="searchresults">
<!--- Remove the http://localhost from the URL --->
<dt>#count#. <a href="#Replace(URL, 'http://localhost', '')#">#Title#</a></dt>
<dd>#Context#</dd>
</dl>
<cfset count = count + 1>
</cfoutput>
If this search page will reside under the same domain as your site then the above example should work just fine. If not then you can replace http://localhost with your site domain.
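As a rough sketch of that second case, the replacement string can come from a variable holding your public address. The siteRoot variable and the www.example.com domain are placeholders, and this assumes the public site serves pages under the same paths that Vspider crawled on localhost.

```cfml
<!--- Hypothetical public address; substitute your own domain --->
<cfset siteRoot = "http://www.example.com">

<cfsearch collection="newsite" name="GetResults" criteria="#Form.criteria#">

<cfoutput query="GetResults">
<!--- Swap the http://localhost prefix that Vspider recorded for the public domain --->
<p><a href="#Replace(GetResults.URL, 'http://localhost', siteRoot)#">#GetResults.Title#</a></p>
</cfoutput>
```

Replace is case-sensitive and, by default, changes only the first match, which is what you want for a URL prefix.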
Save the action page as search_action.cfm in the same directory as search.cfm. View search.cfm in a browser, and use the form to test your collection.
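While testing, it is worth handling an empty search box and a search that matches nothing. Here is a minimal sketch, assuming the same newsite collection and criteria form field used above. The messages are placeholders, and the result markup is trimmed down to keep the example short.

```cfml
<!--- Make sure Form.criteria exists even if the page is requested directly --->
<cfparam name="Form.criteria" default="">

<cfif Len(Trim(Form.criteria)) EQ 0>
  <p>Please enter a search term.</p>
<cfelse>
  <cfsearch collection="newsite" name="GetResults" criteria="#Form.criteria#">
  <cfif GetResults.RecordCount EQ 0>
    <p>No pages matched your search.</p>
  <cfelse>
    <cfoutput query="GetResults">
      <!--- CurrentRow numbers the results without a manual counter --->
      <p>#GetResults.CurrentRow#. #HTMLEditFormat(GetResults.Title)#</p>
    </cfoutput>
  </cfif>
</cfif>
```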
The trick for me in this process was learning that you must index your website using Vspider BEFORE you create the collection in CF Administrator. If you create the collection in CF Administrator first, Vspider will return a list of errors during indexing, none of which explain the problem.
-Guest Article by Richard Baldwin