The Open Archives Initiative is an organization dedicated to making digital libraries interoperable and has developed a standard way to access the data stored in a single archive. Many projects at VT and elsewhere are looking into building effective search engines for the data that is OAI-accessible. For an example of such a service, see ODU's ARC.
As part of the ETD Union Catalog experiments, we are attempting to build a suite of cross-archive search engines for Electronic Theses and Dissertations. However, since many partners in these efforts are non-English-speaking, the data is in various languages and searching does not produce the expected results. For example, suppose there are two documents about computers, one in English and one in Afrikaans. The English one would use the word "computer", but the Afrikaans one would use the word "rekenaar". A person looking for documents on the topic might be interested in both, but an English speaker would enter "computer" and get only the English document, while an Afrikaans speaker would enter "rekenaar" and get only the Afrikaans one. This is a problem.
One solution would be to translate all the documents before building the search engine's index, and then translate all the queries before searching. You may modify an existing free search engine, write your own, or use a template that will be provided. The search engine must gather data using the OAI protocol and must be accessible through an OAI-like interface (the demo code does this). For translations, you may use the GPLTrans open source toolkit or any other toolkit that is available locally.
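The translate-then-index idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the provided template: the word-for-word lexicon (TOY_LEXICON) is a made-up stand-in for a real translation toolkit, the document ids are invented, and the OAI harvesting step is left out so that the example stays self-contained.

```python
from collections import defaultdict

# Hypothetical toy lexicon standing in for a real translation toolkit.
# A real system would call a translator here instead.
TOY_LEXICON = {"rekenaar": "computer", "netwerk": "network", "ontwerp": "design"}


def translate_to_english(text):
    """Word-by-word lookup; words missing from the lexicon pass through unchanged."""
    return " ".join(TOY_LEXICON.get(word, word) for word in text.lower().split())


def build_index(documents):
    """Translate every document, then map each term to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in translate_to_english(text).split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Translate the query the same way, then return ids of documents matching every term."""
    terms = translate_to_english(query).split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Invented sample records; in the project these would come from an OAI harvest.
documents = {
    "etd-en": "computer network design",
    "etd-af": "rekenaar netwerk ontwerp",
}
index = build_index(documents)
```

With this index, searching for either "computer" or "rekenaar" returns both documents, because documents and queries are mapped into the same language before they are compared.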
As part of this project you will have to deal with the ways in which multiple languages are encoded in a single document. Sometimes the text is separated by language, but sometimes it is mixed, and you may need to determine the language boundaries heuristically.
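One possible heuristic for finding language boundaries is to split the text into sentences, guess each sentence's language from common function words, and merge neighbouring sentences that agree. The sketch below assumes this approach; the tiny stopword lists and the function names are illustrative only, and a real system would need much larger lists or a proper language identifier.

```python
import re

# Tiny illustrative stopword lists (assumptions, far from complete).
STOPWORDS = {
    "en": {"the", "of", "and", "is", "in", "to", "this", "a"},
    "af": {"die", "van", "en", "is", "in", "het", "hierdie", "'n", "nie"},
}


def guess_language(sentence):
    """Return the language whose stopwords occur most often, or "unknown" on a tie or no hits."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    scores = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, best_score), (_, runner_up) = ranked[0], ranked[1]
    if best_score == 0 or best_score == runner_up:
        return "unknown"
    return best


def segment_by_language(text):
    """Split text at sentence ends and merge adjacent sentences with the same guessed language."""
    segments = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not sentence:
            continue
        lang = guess_language(sentence)
        if segments and segments[-1][0] == lang:
            segments[-1] = (lang, segments[-1][1] + " " + sentence)
        else:
            segments.append((lang, sentence))
    return segments


# Invented sample abstract mixing English and Afrikaans.
sample = (
    "This thesis is a study of the network. "
    "Hierdie tesis is 'n studie van die netwerk."
)
segments = segment_by_language(sample)
```

Each resulting segment is a (language, text) pair, so the Afrikaans part could be passed to a translator while the English part is indexed as it stands. Sentences with no recognisable stopwords come back as "unknown" and would need a fallback rule.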
Note: Someone once asked at a meeting, "Why would you want to find something you can't read?" The simple answer is that we can then use a machine translation tool to get the gist of the document. See FreeTranslation for one such service that is completely free!