I didn’t know anything about IBM’s Domino. Or is it Lotus? Or is it Notes? Or some combination of all three? I will attempt to clarify what is named what in a moment; for now let’s just call it Lotus Domino. Coming from a mainly Microsoft-focused background, I knew nothing about Lotus Domino until a month ago, when a project required me to extract some documents from it. I am not talking about email attachments or anything to do with email here. Part of the problem when researching the Lotus Domino API is that the content that turns up usually refers to email rather than document management. What content you do find on the web is neither comprehensive nor all in one place, so I had to piece together bits of information to complete my task. As you will notice, the code itself is not complicated, but understanding the Domino architecture, taxonomy and API takes time. This article collects the fruits of my research and briefly explains how to extract documents from a Domino database. Here are a few helpful facts before we get to the code.
What is named what? Whether through acquisitions, product mergers or simply the need to distinguish a client from a server, IBM has managed to confuse people with the various names. The Notes/Domino product suite follows a client-server model where Lotus Notes is the client and Lotus Domino is the server. This is analogous to Microsoft’s email product, where the client is called Outlook and the server is called Exchange. However, unlike Outlook, Lotus Notes is billed as an application platform, so apart from email it can also be used as a client to access the document management capabilities of the Domino server.
What is Domino.Doc? Adding to the confusion is yet another product called Lotus Domino Document Manager or Domino.Doc. Thankfully, this is retired (although it is replaced by yet another product called Lotus Quickr). Domino.Doc had only document management capabilities. In this project I was asked to work with Lotus Domino 8.5.1 (Enterprise edition) so the code here should work for this and higher versions.
Ultimately, where do documents reside and how do we access them? Once we navigate the maze of names, products, applications and add-ins that the Lotus Domino/Notes suite offers, it becomes apparent that the core document management capabilities are provided by the Domino server. As mentioned earlier, document management is just one of the features provided by Lotus Domino. It also provides other features like email, calendars, discussions and workflows. Documents are stored in non-relational, document-centric, file-based databases with the extension “.nsf”. Since these are not relational databases, you will not be able to query them like you would a SQL Server database. There is an API that we can use to programmatically access the .nsf database for emails, calendars, documents etc.
Can we not just use the Lotus Notes/Domino web services? While researching I learnt that Lotus Domino provides web services. That would make life so easy if we could just call the out-of-the-box web services to access Domino like we can with SharePoint, TFS and Exchange. Unfortunately, Lotus Domino merely provides the infrastructure to write and configure web services. There are no out-of-the-box pre-written web services that we can simply call. Since we still have to provide the meat, we need to know the internals of the product and the API anyway. Now this fact is not apparent in what little documentation IBM has put out there.
Do I need a server with Domino installed to use the COM based API? Yes. I have read somewhere that you can get away with running your code on a box where neither the Domino client nor the server is installed, as long as you register one dll. In my experience that is not true. The dll in question depends on other dlls, so you may end up having to register a whole lot of them to get the API to work. It is much easier to install at least the Domino client, or to seek out a box where it is already installed, and run your code there; otherwise you will run into all sorts of COM errors.
So where is this API and how can I access it? The LotusScript/COM/OLE classes are documented here. General information on accessing Domino objects via COM is described here. The hardest part is getting the actual COM library itself. It’s nowhere to be found in the official documentation sites or the installation folders. I (as did others) found it as part of the source download in this CodeProject article. I have attached the dll to this blog post for convenience.
Calling the COM API from C# to extract documents
As the heading of this post notes the purpose here is to query the Lotus Domino database and extract documents that I save to a local folder. I also need to get the attributes of the document that I just extracted. In my project I copy these attributes to a local text file in XML format but you could just as easily copy them to a database or elsewhere if you would like. Make sure you have downloaded and copied the Interop.Domino.dll that I uploaded here. Here is a step by step description.
1. After creating the project in Visual Studio, add a reference to Interop.Domino.dll by browsing to the location where you copied it. There is no need to register this dll; it is simply an interface to the native API. It does depend, though, on other dlls that will already have been registered as part of the Domino installation.
2. Initialize a COM session. You can use the Initialize method of the NotesSessionClass and pass a password (it assumes the user from the context the code is running under) or you can call the InitializeUsingNotesUserName and explicitly pass a user name and password that has access to Domino.
var dominoSession = new Domino.NotesSessionClass();
// dominoSession.Initialize(Password);
dominoSession.InitializeUsingNotesUserName(UserName, Password);
3. Create a Domino database object and open the database (which is a .nsf file, as mentioned earlier). For the server parameter of the GetDatabase function, you can pass an empty string if the code is running on the same server as the database. For the database parameter, pass the full path and file name of the .nsf file (e.g. “C:\Domino\Data\AdminDB.nsf”). The third parameter is optional and, if set to true, will create a database object even if opening the database fails.
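// serverName: empty string when the code runs on the Domino server itself.
// databaseFilePath: full path and file name of the .nsf file.
var dominoDatabase = dominoSession.GetDatabase(serverName, databaseFilePath, false);
if (dominoDatabase == null)
{
    throw new Exception("Unable to create Domino database object.");
}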
4. At this point a session has been created and a reference to the database obtained. Next we need to query the database for documents. We can call AllDocuments of the NotesDatabase class to get all documents in the database. However, here I need documents which were created or modified on or after a certain date so I will call the GetModifiedDocuments method and pass a date parameter.
// var documentCollection = dominoDatabase.AllDocuments;
Domino.NotesDateTime lastCheckedDate = dominoSession.CreateDateTime("3/5/2010 12:10:00 AM");
var documentCollection = dominoDatabase.GetModifiedDocuments(lastCheckedDate);
if (documentCollection == null)
{
    throw new Exception("Unable to get documents.");
}
5. Now comes the interesting part, where we enumerate the documents to read their properties and get the content. But before we get into the code, I need to explain what a “document” means in Domino. In Lotus Domino a document is defined differently from the traditional Windows document. In Windows, a document or file that resides on the drive and is accessed via, say, Explorer has properties and content associated with it. In Domino a document is a higher-level entity (think of it as the root or parent object) that can contain one or more attachments. These attachments are the objects that have content. A document can have attributes like id, create date, subject, author(s), HTTP and Notes URLs etc., but not the content itself. Each attachment belonging to a document has a name and content. A diagram may help explain the concept better. Below I show a document identified by its id that contains two attachments: one is an MS Word document and the other is a PDF.
6. The following code iterates through the document collection and reads a few attributes of each document. I found that a document has two types of attributes. The main set can be read directly via properties of the NotesDocument class, as you will see below. A second set, which I call “extended attributes”, can be read through the Items property of the NotesDocument class. If the document contains attachments, one or more of the items will have its type set to “attachment”. When we encounter an item of that type, we can get its name and content, as the code shows. An item of any other type is most likely an extended attribute. When I dumped these extended attributes into an XML file, I got attributes like document category, area and region in the taxonomy, root name/title, approver names, publisher name, revision number, effective/expired dates etc.
var document = documentCollection.GetFirstDocument();
while (document != null)
{
    // Only look at live documents that carry embedded objects or attachments.
    if (document.IsValid
        && !document.IsDeleted
        && document.HasEmbedded)
    {
        // Main attributes, read directly from NotesDocument properties.
        if (document.Created != null)
        {
            string createDate = document.Created.ToString();
        }
        if (document.LastModified != null)
        {
            string lastModifiedDate = document.LastModified.ToString();
        }
        if (document.UniversalID != null)
        {
            string universalId = document.UniversalID;
        }
        if (document.HttpURL != null)
        {
            string httpUrl = document.HttpURL;
        }
        if (document.Authors != null)
        {
            // Convert element by element; a direct cast from object[] to
            // string[] can fail at runtime.
            object[] authors = (object[])document.Authors;
            string[] authorsList = Array.ConvertAll(authors, a => a == null ? null : a.ToString());
        }
        // Extended attributes and attachments live in the Items collection.
        object[] items = (object[])document.Items;
        if (items != null)
        {
            foreach (var item in items)
            {
                Domino.NotesItem notesItem = (Domino.NotesItem)item;
                if (notesItem != null
                    && notesItem.Values != null
                    && !(notesItem.Values is string))
                {
                    object[] itemValues = (object[])notesItem.Values;
                    if (itemValues != null && itemValues.Length > 0)
                    {
                        if (notesItem.type == Domino.IT_TYPE.ATTACHMENT)
                        {
                            // For an attachment item the first value is the file name.
                            if (itemValues[0] != null && notesItem.Name != null)
                            {
                                string fileName = itemValues[0].ToString();
                                ExtractFileToLocalFolder(document, fileName);
                            }
                        }
                        else
                        {
                            foreach (object value in itemValues)
                            {
                                if (value != null)
                                {
                                    string itemName = notesItem.Name;
                                    string itemValue = value.ToString();
                                    // log to xml file or database
                                }
                            }
                        }
                    }
                }
            }
        }
    }
    document = documentCollection.GetNextDocument(document);
}
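The "log to xml file or database" step is left open in the loop above. One way to fill it in, sketched here with System.Xml.Linq, is to collect each item name and value and build an XML element per document. The AttributeXml class, the BuildDocumentElement method and the element and attribute names are my own illustrative choices (hypothetical), not part of the Domino API.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class AttributeXml
{
    // Builds one <document> element from a universal ID and the
    // (item name, item value) pairs collected while looping over document.Items.
    // Element and attribute names here are illustrative assumptions.
    public static XElement BuildDocumentElement(
        string universalId,
        IEnumerable<KeyValuePair<string, string>> attributes)
    {
        var element = new XElement("document",
            new XAttribute("universalId", universalId));
        foreach (var pair in attributes)
        {
            // The item name goes into an attribute rather than the element
            // name, because names starting with "$" are not valid XML names.
            element.Add(new XElement("item",
                new XAttribute("name", pair.Key),
                pair.Value));
        }
        return element;
    }
}
```

The resulting elements could then be added under a single root element and written out with XDocument.Save. Keeping the item name in an attribute avoids having to sanitize names before using them in XML.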
7. The code for the ExtractFileToLocalFolder method is below. This method is called every time an attachment type is encountered while iterating through the items.
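private void ExtractFileToLocalFolder(Domino.NotesDocument document, string fileName)
{
    // Look up the attachment by file name and save it under FilePath,
    // the local target folder.
    var attachment = document.GetAttachment(fileName);
    if (attachment != null)
    {
        attachment.ExtractFile(FilePath + @"\" + fileName);
    }
}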
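Because ExtractFileToLocalFolder writes every attachment into the same folder under its original name, attachments with the same name in different documents would collide. A minimal sketch of one way around that, assuming you are happy to prefix the file name with the document's universal ID, is below. The AttachmentPaths class and BuildTargetPath method are hypothetical helpers of mine, not part of the Domino API.

```csharp
using System;
using System.IO;

public static class AttachmentPaths
{
    // Returns folder + "<universalId>_<fileName>", after reducing fileName
    // to its last path segment and replacing characters that are not
    // allowed in file names on the current platform.
    public static string BuildTargetPath(string folder, string universalId, string fileName)
    {
        string safeName = Path.GetFileName(fileName);
        if (string.IsNullOrEmpty(safeName))
        {
            throw new ArgumentException("fileName has no usable name part.", "fileName");
        }
        foreach (char c in Path.GetInvalidFileNameChars())
        {
            safeName = safeName.Replace(c, '_');
        }
        return Path.Combine(folder, universalId + "_" + safeName);
    }
}
```

The result could be passed to ExtractFile in place of the concatenated path. Prefixing with the universal ID is just one convention; a subfolder per document would work as well.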
Conclusion
As you may have noted, the code is fairly straightforward. Comprehending the concepts and the taxonomy of Lotus Domino is the tricky part. Finally, do note that this is not production code; I purposely simplified it for this article. Also, you may want to save the attributes, both the main ones and the extended ones, to a file or database. Here I just copy them to meaningless variables to keep the code clean.
EDIT: Though I have left it out for the sake of brevity, do remember to clean up the COM objects – documentCollection, dominoDatabase and dominoSession after you are done by calling Marshal.ReleaseComObject() on them preferably in a finally block.
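The cleanup described in the EDIT can be wrapped in a small helper so that the finally block stays short. This is only a sketch: the ComCleanup class and ReleaseAll method are hypothetical names of mine, while Marshal.IsComObject and Marshal.ReleaseComObject are the standard .NET interop calls.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComCleanup
{
    // Releases every argument that is a COM wrapper and returns how many
    // were released. Nulls and ordinary .NET objects are skipped, so it is
    // safe to call even when initialization failed part of the way through.
    public static int ReleaseAll(params object[] comObjects)
    {
        if (comObjects == null)
        {
            return 0;
        }
        int released = 0;
        foreach (object candidate in comObjects)
        {
            if (candidate != null && Marshal.IsComObject(candidate))
            {
                Marshal.ReleaseComObject(candidate);
                released++;
            }
        }
        return released;
    }
}
```

In the code from this article, the call in the finally block would look like ComCleanup.ReleaseAll(documentCollection, dominoDatabase, dominoSession), with the variables declared before the try block so they are in scope.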
