
Wire-Swig

Wire-Swig provides libwire bindings for several languages. libwire is part of WIRE, a Web crawler implementation. Wire-Swig focuses on retrieving information from WIRE's storage and index.

Download

  • wire-swig-0.20.tar.gz
    • MD5: 9d9ba0fbb9b2493bdd2e42b35ff0656e
    • SHA1: 7f0aff591374a133157c8dfb54293d9a12189242
  • wire-swig-0.10.tar.gz
    • MD5: 6bcef8aa045b4de2cb1b591e01bbd23d
    • SHA1: 59d9279d8123f5951f764fbd2103b075de3f1332
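
The published checksums can be used to confirm that a download arrived intact. The following is a minimal sketch, assuming a modern Python with the standard hashlib module (this is newer than the Python 2.3.5 mentioned under Requirements). The helper names file_digest and verify_download are hypothetical and are not part of Wire-Swig.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large tarballs are never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_download(path, expected_md5, expected_sha1):
    """Return True only if both digests match the published values."""
    return (file_digest(path, "md5") == expected_md5.lower()
            and file_digest(path, "sha1") == expected_sha1.lower())


# Hypothetical usage with the values listed above:
# verify_download("wire-swig-0.20.tar.gz",
#                 "9d9ba0fbb9b2493bdd2e42b35ff0656e",
#                 "7f0aff591374a133157c8dfb54293d9a12189242")
```

Checking both digests is redundant in practice, but it mirrors the two values published for each release.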

Requirements

Ruby, Python, Perl, or another language supported by SWIG. The software has been tested with Ruby 1.8.2 and Python 2.3.5.

Example

Ruby

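# example.rb
require 'Wire'
# Point libwire at its configuration file.
ENV['WIRE_CONF'] = '/path/to/wire.conf'
idxdir = '/path/to/wire/index'
idx = Wire::Index.new(idxdir)
# Document IDs run from 1 to count_doc inclusive.
1.upto(idx.count_doc) do |i|
  d = idx.doc_retrieve(i)
  # Skip anything that is not HTML.
  next unless d.mime_type == Wire::MIME_TEXT_HTML
  # puts ends each value with a newline so the URL and text do not run together.
  puts idx.url_by_docid(i)
  puts idx.retrieve_text_by_docid(i)
end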

Python

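# example.py
import Wire
import os
# Point libwire at its configuration file.
os.environ['WIRE_CONF'] = '/path/to/wire.conf'
idxdir = '/path/to/wire/index'
idx = Wire.Index(idxdir)
# Document IDs run from 1 to count_doc() inclusive, as in the Ruby example.
for i in range(1, idx.count_doc() + 1):
  d = idx.doc_retrieve(i)
  # Skip anything that is not HTML.
  if d.mime_type == Wire.MIME_TEXT_HTML:
    print(idx.url_by_docid(i))
    print(idx.retrieve_text_by_docid(i))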
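
The loop in the examples above can be wrapped in a small generator so that callers receive (URL, text) pairs instead of printing them. This is only a sketch: iter_html_docs is a hypothetical helper, not part of Wire-Swig. It assumes the index object exposes count_doc(), doc_retrieve(), url_by_docid() and retrieve_text_by_docid() as used in the examples, and that document IDs run from 1 to count_doc() inclusive, as the Ruby loop suggests.

```python
def iter_html_docs(idx, html_mime):
    """Yield (url, text) for every HTML document in a Wire-style index.

    `idx` is any object with the methods used in the examples above;
    `html_mime` is the constant to compare against, e.g. Wire.MIME_TEXT_HTML.
    """
    # Assumption: valid document IDs are 1..count_doc() inclusive.
    for docid in range(1, idx.count_doc() + 1):
        doc = idx.doc_retrieve(docid)
        if doc.mime_type != html_mime:
            continue  # skip non-HTML documents
        yield idx.url_by_docid(docid), idx.retrieve_text_by_docid(docid)


# Hypothetical usage with the real binding:
# import Wire
# for url, text in iter_html_docs(Wire.Index(idxdir), Wire.MIME_TEXT_HTML):
#     print(url)
```

Passing the MIME constant as an argument keeps the helper independent of the Wire module, so it can be exercised with a stand-in index object.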

License

Copyright 2006 NOKUBI Takatsugu <knok@daionet.gr.jp>

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

ToDo

  • charset enum support
  • read base dir from config