Introduction
htdig is a web search engine licensed under the GNU General Public License. It uses a very simple configuration file that lets you limit the search to the web pages you specify. For example, you can exclude the cgi-bin or a testing directory from the search engine. Besides being installed on a webserver, it is used as a search engine plugin by some programs, such as Glade, the GTK+ User Interface Builder. It will create a searchable database of any website; you just supply the URL.
Installing htdig
- Download the latest version from the htdig FTP server.
    tar -xvzf htdig-3.1.5.tar.gz
    cd htdig-3.1.5
    ./configure
    make
    make install
Configuring htdig
Once you have htdig installed, you must make a few changes to the configuration file and the HTML templates into which the search results are embedded.
Configuration File
The configuration file for htdig is located at /opt/www/htdig/conf/htdig.conf.
It is pretty self-explanatory. The main attributes you need to configure are listed below. htdig will work if you leave the other options at their defaults, or you can change them if you wish.
Attribute | Value | Example |
start_url | URL of your site | http://www.mywebsite.com |
exclude_urls | Directories you do not want searched, separated by white space | /cgi-bin/ /testing/ |
maintainer | Email address of the administrator | admin@mywebsite.com |
search_results_header | HTML file to be used as the header of search results. Set this only if you don’t want to use the default header file, /opt/www/htdig/common/header.html | /home/httpd/search/header.html |
search_results_footer | HTML file to be used as the footer of search results. Set this only if you don’t want to use the default footer file, /opt/www/htdig/common/footer.html | /home/httpd/search/footer.html |
nothing_found_file | HTML file to be displayed if nothing matches the search string entered. Set this only if you don’t want to use the default file, /opt/www/htdig/common/nomatch.html | /home/httpd/search/nomatch.html |
syntax_error_file | HTML file to be displayed if there is a syntax error in the search string entered. Set this only if you don’t want to use the default file, /opt/www/htdig/common/syntax.html | /home/httpd/search/syntax.html |
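As a rough illustration of the attributes above, the sketch below writes a small set of overrides to a scratch file using htdig's `attribute: value` syntax. The function name `write_htdig_overrides` and the output path are hypothetical, and the values are only the example values from this article. The contact address is set with htdig's `maintainer` attribute. Merge the lines into your real htdig.conf by hand rather than overwriting it.

```shell
#!/bin/sh
# Hypothetical helper: writes example attribute overrides to the file given
# as $1. The values are illustrative, not htdig's defaults.
write_htdig_overrides() {
    out="$1"
    cat > "$out" <<'EOF'
# Where the crawl starts
start_url:              http://www.mywebsite.com
# URL patterns to skip, separated by white space
exclude_urls:           /cgi-bin/ /testing/
# Contact address for the person running the index
maintainer:             admin@mywebsite.com
# Optional: custom templates instead of the defaults under common/
search_results_header:  /home/httpd/search/header.html
search_results_footer:  /home/httpd/search/footer.html
nothing_found_file:     /home/httpd/search/nomatch.html
syntax_error_file:      /home/httpd/search/syntax.html
EOF
}

# Example: write the overrides to a scratch file for review.
write_htdig_overrides "${TMPDIR:-/tmp}/htdig-overrides.conf"
```

Leaving out the four template lines keeps the default files under /opt/www/htdig/common in use.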
HTML Templates
If you don’t want to use the default look-and-feel of htdig, you can edit the following files to match the look-and-feel of your website. The paths will be different if you changed them in your configuration file.
/opt/www/htdig/common/header.html
/opt/www/htdig/common/footer.html
/opt/www/htdig/common/nomatch.html
/opt/www/htdig/common/syntax.html
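Before editing these four files, it may be worth keeping copies of the defaults so you can get back to a working page. The sketch below is one way to do that. The function name `backup_htdig_templates` and the `.orig` suffix are assumptions for illustration, not part of htdig.

```shell
#!/bin/sh
# Hypothetical helper: copy each default template to <name>.orig inside the
# directory given as $1, skipping files that already have a backup.
backup_htdig_templates() {
    common_dir="$1"
    for name in header.html footer.html nomatch.html syntax.html; do
        src="$common_dir/$name"
        if [ -f "$src" ] && [ ! -e "$src.orig" ]; then
            cp -p "$src" "$src.orig"
        fi
    done
}

# Example (path from this article; adjust if your install differs):
# backup_htdig_templates /opt/www/htdig/common
```

Because an existing backup is never overwritten, running the helper again after you have edited a template leaves the original default untouched.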
- Next, you must set up the search database by running the script /opt/www/htdig/bin/rundig.
- Copy the default search.html and images from /opt/www/htdocs/htdig to a directory named htdig off of your web root. If the images are not in this directory, they will not appear unless you configure it otherwise in htdig.conf.
- Copy /opt/www/cgi-bin/htsearch to the cgi-bin for your webserver.
- Test the search engine by opening search.html in your browser and entering a search string.
- Because the search engine uses a database to return results, the database must be rebuilt with the rundig command from the first step every time any pages are added to the website.
- If you want to configure anything else, refer to the htdig website. Pretty much everything is configurable with htdig.
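The copy steps above can be sketched as a small shell function. The function name `publish_htdig` and the web root and cgi-bin paths in the example are assumptions; substitute the directories your webserver actually uses.

```shell
#!/bin/sh
# Hypothetical helper: copy search.html and the images into <web root>/htdig
# and htsearch into the server's cgi-bin.
#   $1 = htdig htdocs dir   (e.g. /opt/www/htdocs/htdig)
#   $2 = web root           (assumed, e.g. /home/httpd/html)
#   $3 = htsearch binary    (e.g. /opt/www/cgi-bin/htsearch)
#   $4 = server cgi-bin dir (assumed, e.g. /home/httpd/cgi-bin)
publish_htdig() {
    mkdir -p "$2/htdig" || return 1
    cp -R "$1"/. "$2/htdig/" || return 1
    cp -p "$3" "$4/" || return 1
}

# Example run, commented out because the paths are site-specific:
# /opt/www/htdig/bin/rundig        # build the search database first
# publish_htdig /opt/www/htdocs/htdig /home/httpd/html \
#               /opt/www/cgi-bin/htsearch /home/httpd/cgi-bin
```

Since the database has to be rebuilt whenever pages change, one option is to run rundig from cron. For example, a crontab line such as `0 3 * * * /opt/www/htdig/bin/rundig` would rebuild it every night at 03:00.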