Copyright
Brief Table of Contents
Table of Contents
Foreword
Preface
Acknowledgments
About this Book
About the Authors
About the Cover Illustration
1. Getting started
Chapter 1. The case for the digital Babel fish
Chapter 2. Getting started with Tika
Chapter 3. The information landscape
2. Tika in detail
Chapter 4. Document type detection
Chapter 5. Content extraction
Chapter 6. Understanding metadata
Chapter 7. Language detection
Chapter 8. What’s in a file?
3. Integration and advanced use
Chapter 9. The big picture
Chapter 10. Tika and the Lucene search stack
Chapter 11. Extending Tika
4. Case studies
Chapter 12. Powering NASA science data systems
Chapter 13. Content management with Apache Jackrabbit
Chapter 14. Curating cancer research data with Tika
Chapter 15. The classic search engine example
Appendix A. Tika quick reference
Appendix B. Supported metadata keys
Index
List of Figures
List of Tables
List of Listings