New modularity and Indic language support

Bruno Lowagie interview: “iText 7 goes beyond PDF” | GIDS 2016

Geertjan Wielenga
iText

Bruno Lowagie, the original developer of iText, talks about the Java PDF library’s expansion, the upcoming release of iText 7 and what happens next.

Sitting in the lobby of the Sheraton Hotel in Bangalore, preparing for his keynote at the Great Indian Developer Summit (GIDS) 2016, Bruno Lowagie, the original developer of iText, spent a few moments describing the recent series of fundamental and revolutionary changes in the iText world. As most in the Java world should know by now, iText is a Java PDF library and it is open source, but there is a commercial offering too.

Over the past few years, the iText company has grown exponentially and has, for example, received the Deloitte Technology Fast50 Award as the fastest-growing technology company in Belgium. For some time, iText has also wanted to broaden its offering. In July 2015, iText talked with a customer, Hancom, a South Korean maker of productivity software. While sharing roadmaps, the two companies saw that their focus areas were highly complementary: Office formats on one hand and PDF on the other. They recognized the match immediately and decided to join forces. Hancom, which was looking to expand worldwide, established itself in Belgium and bought iText to bring the two focus areas together.

Geertjan Wielenga: Wow, Bruno, so a lot has happened!

Bruno Lowagie: Yes, this is all very exciting for me. We had been growing and hiring for some time, and now we’re getting a big jump start by joining a company with 1,200 employees, whereas we were a small company in Belgium with about 20 employees. So, this creates a lot of opportunities. We now have access to all of Hancom’s technologies, while they’re looking to us to prepare their products for the western market. Their new company ThinkFree in Belgium has a license to sell all their products in Europe and the US. In the next few years, we’ll release more and more products, and they will always be in the document business, which is where our know-how lies: for example, converting Office documents to PDF, running on Tomcat. So everything continues to be Java based.

Geertjan Wielenga: In other words, all of this is still a Java story and is a celebration of Java’s awesomeness?

Bruno Lowagie: Yes. iText is offered as a Java library and a C# library, but the main development track continues to be Java. In fact, we jumped from iText 2 to iText 5 in 2009 to keep in sync with the release of Java 5. I know Java 5 was already end of life by then, but we can’t migrate too quickly because the server market moves very slowly through Java releases. Now we are jumping from iText 5 to iText 7 for the same reason: we are now in sync with Java 7. Some people will see not migrating to Java 8 immediately as a missed opportunity, but we can see that our customers are not ready for that yet.

The iText 5 architecture was based on a design from the year 2000, which worked very well for us for all those years, though we were reaching the limits of that architecture. One major problem we had, and the number 1 request from users, was support for Indic languages, such as Hindi. This required a complete rewrite of our font support, which was impossible to do in iText 5: we couldn’t bolt that language support onto the existing library without breaking a lot of the functionality. We saw this as an opportunity to come up with a new architecture that allows us to meet a lot of our other customer requests at the same time. For instance, Evernote is a customer that uses iText on Android and reached the implementation limits of iText, so they’ll also benefit from the new architecture, which is more modular and lets users select which parts they want to use. We also cleaned up the API. In the past it wasn’t possible to see the difference between a class and an interface, and we have refactored things to make the API more intuitive and consistent from a programming point of view.

Geertjan Wielenga: When is the release of iText 7?

Bruno Lowagie: We’re officially announcing this at GIDS in Bangalore, and we already have a number of test users giving us feedback. The official release will be on May 1 this year.

Geertjan Wielenga: What’s happening after iText 7?

Bruno Lowagie: Because of the modular approach, iText will be easier to extend. In iText 7 we support Hindi and Tamil, though we know customers also want Telugu and other eastern languages. Thanks to the new architecture, these kinds of needs will be easier to meet.

The same is true for new standards. In iText 5 we already support PDF/A, and when we wanted to start supporting PDF/UA we noticed that we needed to improve the API. Based on our experience with standards that emerged after 2009, we decided to design iText 7 so that it is ready for those new standards, which is especially important for ISO 32000-2, also known as PDF 2.0. This new PDF version will be released in 2017, and we wanted to make sure that iText would be ready when the new standard is released.

The modular approach allows us to come up with value packs targeted at specific niche segments. For instance, in Germany there is a standard called ZUGFeRD for invoices that can be consumed by humans as well as machines. It is a combination of PDF/A-3 and the cross-industry invoice standard, and support for it will be an add-on on top of iText 7. Another example of an add-on is pdfSweep, which allows you to remove content from a PDF. This is important for the medical sector, where they want to anonymize documents. This plugin can evolve into PDF Redact, which will offer full-fledged redaction functionality.

Geertjan Wielenga: Bruno, what is your role in this brave new world?

Bruno Lowagie: I started as the founder. The idea of the iText library was mine, and I also developed the first version, back in 2000. The initial iText company dates from 2008. I evolved into being the CEO of a group of companies with subsidiaries in the US, Belgium, and Singapore. Two weeks ago, we hired a new CEO, and now I can focus on the strategy of the document business of iText and Hancom. So, now I am the Chief Strategy Officer. I still see myself as the evangelist, not only for iText, but for PDF in general. The co-operation with Hancom is a great opportunity and I’m really looking forward to the next few years to see what will happen.

Geertjan Wielenga: Do all of these developments have an impact on the open source message and ecosystem at large?

Bruno Lowagie: Although we developed iText 7 from scratch, and we could have made it closed source, we decided to keep it open source. Nevertheless, we will have some modules that will be closed source, because they will be value add-ons. We have one closed product called XFAWorker, and so for the value add-ons we will decide whether to make them open source or not. When it is something very specific, when we don’t expect to have many core contributions, we will keep it closed source. Why closed source? Because we have noticed that there is a lot of ignorance about the fact that the AGPL has specific requirements if you want to use the software for free. By making small parts of the software closed source, we hope that developers will pay more attention to licensing. We want to create the awareness that nothing in life is really free, someone always pays.

Geertjan Wielenga: So, in summary, wow, iText is exploding and an even more massive success than one could have imagined?

Bruno Lowagie: If you had told me about the above developments 5 years ago, I would not have believed you, yes. :-) There is much more to PDF than meets the eye and so that’s why we’re involved in the standardization process, while our success in the PDF world has allowed us to look beyond PDF to see where the role of PDF fits into the total document system.

Congratulations on the success of iText and all the best for the GIDS 2016 keynote on these inspiring developments.

Author

Geertjan Wielenga

Geertjan is an open source evangelist, working at Oracle, on products such as Oracle JET and NetBeans IDE.

