Use mysolr.Solr object to connect to a Solr instance.
from mysolr import Solr
# Default connection. Connecting to http://localhost:8080/solr/
solr = Solr()
# Custom connection
solr = Solr('http://foo.bar:9090/solr/')
New in version 0.9.
You can reuse HTTP connection by using requests.Session object
from mysolr import Solr
import requests
session = requests.Session()
solr = Solr('http://localhost:8983/solr/collection1', make_request=session)
New in version 0.9.
Using a requests.Session object allows you to connect to servers secured with HTTP basic authentication as follows:
from mysolr import Solr
import requests
session = requests.Session()
session.auth = ('admin', 'admin')
solr = Solr('http://localhost:8983/solr/collection1', make_request=session)
New in version 0.8.
Solr 4.0 changed a bit the api so, Solr object will guess the solr server version by making a request. You can manually set the solr version with the paremeter version
from mysolr import Solr
# Default connection. Connecting to a solr 4.X server
solr = Solr(version=4)
Making a query to Solr is very easy, just call search method with your query.
from mysolr import Solr
solr = Solr()
# Search for all documents
response = solr.search(q='*:*')
# Get documents
documents = response.documents
Besides, all available Solr query params are supported. So making a query using pagination would be as simple as
from mysolr import Solr
solr = Solr()
# Get 10 documents
response = solr.search(q='*:*', rows=10, start=0)
Some parameters contain a period. In those cases you have to use a dictionary to build the query:
from mysolr import Solr
solr = Solr()
query = {'q' : '*:*', 'facet' : 'true', 'facet.field' : 'foo'}
response = solr.search(**query)
Sometimes specifying a HTTP parameter multiple times is needed. For instance when faceting by several fields. Use a list in that case.:
from mysolr import Solr
solr = Solr()
query = {'q' : '*:*', 'facet' : 'true', 'facet.field' : ['foo', 'bar']}
response = solr.search(**query)
The typical concept of cursor in relational databases is also implemented in mysolr.
from mysolr import Solr
solr = Solr()
cursor = solr.search_cursor(q='*:*')
# Get all the documents
for response in cursor.fetch(100):
# Do stuff with the current 100 documents
pass
This is a query example using facets with mysolr.
from mysolr import Solr
solr = Solr()
# Search for all documents facets by field foo
query = {'q' : '*:*', 'facet' : 'true', 'facet.field' : 'foo'}
response = solr.search(**query)
# Get documents
documents = response.documents
# Get facets
facets = response.facets
Facets are parsed and can be accessed by retrieving facets attribute from the SolrResponse object. Facets look like this:
{
'facet_dates': {},
'facet_fields': {'foo': OrderedDict[('value1', 2), ('value2', 2)]},
'facet_queries': {},
'facet_ranges': {}
}
Ordered dicts are used to store the facets because order matters.
In any case, if you don’t like how facets are parsed you can use raw_content attribute which contains the raw response from solr.
This is an example of a query that uses the spellcheck component.
from mysolr import Solr
solr = Solr()
# Spell check query
query = {
'q' : 'helo wold',
'spellcheck' : 'true',
'spellcheck.collate': 'true',
'spellcheck.build':'true'
}
response = solr.search(**query)
Spellchecker results are parsed and can be accessed by getting the spellcheck attribute from the SolrResponse object.:
{'collation': 'Hello world',
'correctlySpelled': False,
'suggestions': {
'helo': {'endOffset': 4,
'numFound': 1,
'origFreq': 0,
'startOffset': 0,
'suggestion': [{'freq': 14,
'word': 'hello'}]},
'wold': {'endOffset': 9,
'numFound': 1,
'origFreq': 0,
'startOffset': 5,
'suggestion': [{'freq': 14, 'word': 'world'}]}}}
stats attribute is just a shortcut to stats result. It is not parsed and has the format sent by Solr.
Like stats, highlighting is just a shortcut.
As mysolr is using requests, it is posible to make concurrent queries thanks to grequest
from mysolr import Solr
solr = Solr()
# queries
queries = [
{
'q' : '*:*'
},
{
'q' : 'foo:bar'
}
]
# using 10 threads
responses = solr.async_search(queries, size=10)
See installation section for further information about how to install this feature.
from mysolr import Solr
solr = Solr()
# Create documents
documents = [
{'id' : 1,
'field1' : 'foo'
},
{'id' : 2,
'field2' : 'bar'
}
]
# Index using json is faster!
solr.update(documents, 'json', commit=False)
# Manual commit
solr.commit()