PostgreSQL (pronounced Post-Gres-Q-L) or postgres is an open source object relational database management system. It emphasizes extensibility, creativity, as well as compatibility. It competes with the major relational database vendors such as Oracle, MySQL, SQL servers, and others. It is used by different sectors including government agencies and the public and private sectors. It is a cross-platform DBMS, and runs on most modern operating systems including Windows, MAC, and Linux flavors. It conforms to SQL, and is ACID compliant.
PostgreSQL (http://www.postgresql.org/) has many rich features. It provides enterprise level services including performance, unique features, and scalability. It has a very supportive community and very good documentation.
The name PostgreSQL is derived from post-ingres database. PostgreSQL history could be summarized as follows:
PostgreSQL begun at version 6, having a very strong starting point due to the advantage of several years of research and development. Being an open source project with a very good reputation, PostgreSQL attracted hundreds of developers. Currently, PostgreSQL has an uncountable number of extensions and a very active community.
PostgreSQL provides many features that attract developers, administrators, architects, and companies.
PostgreSQL is a free open source software (OSS). It is released under the PostgreSQL license, which is similar to the BSD and MIT licenses. The PostgreSQL license is highly permissive, and PostgreSQL is not subject to monopoly and acquisition. This gives the company the following advantages:
PostgreSQL is very attractive for developers, administrators, and architects. It has rich features that enable developers to perform tasks in an agile way. The following are some of the features that are attractive to developer:
PostgreSQL can be used with a variety of applications. The main PostgreSQL application domains can be classified into two categories:
OLTP is used to model business operations such as customer relationship management (CRM). For example, the car web portal example in Chapter 1, Relational Databases, is an example of an OLTP application. OLAP applications are used for business intelligence, decision support, reporting, and planning. An OLTP database size is relatively small as compared to an OLAP database. OLTP normally follows relational model concepts, such as normalization ,when designing the database, while OLAP has less relation; the schema often has the shape of a star or a snowflake. Finally the data is deformalized.
In the car web portal example, we could have another database to store and maintain all the sellers' and users' historical data to analyze user preferences and seller activities. This database is an example of an OLAP application.
Unlike OLTP, OLAP's main operation is data retrieval. OLAP data is often generated by a process called ETL (extract, transform, and load). ETL is used to load data in to the OLAP database from different data sources and different formats.
PostgreSQL can be used out of the box for OLTP applications. For OLAP, there are many extensions and tools to support it, such as the PostgreSQL COPY command and Foreign Data Wrappers (FDW).
PostgreSQL is used in many application domains including communication, medical, geographical, and e-commerce applications. Many companies provide consultation as well as commercial services, such as migrating proprietary RDBMS to PostgreSQL in order to cut off the licensing costs. These companies often influence and enhance PostgreSQL by developing and submitting new features.
The following are a few companies that have used PostgreSQL:
In addition to the companies mentioned in the preceding list, PostgreSQL is used by HP, WMware, and Heroku. PostgreSQL is used by many scientific communities and organizations, such as NASA, due to its extensibility and rich data types.
There are more than 20 PostgreSQL forks; PostgreSQL extensible APIs make postgres a great candidate for forking. Over the years, many groups forked PostgreSQL and contributed their findings to PostgreSQL. The following is a list of the popular PostgreSQL forks:
PostgreSQL uses the client/server model, where the client and server programs can be on different hosts. The communication between the client and server is normally done via TCP/IP protocols or via Linux sockets. PostgreSQL can handle multiple connections from a client. A common PostgreSQL program consists of the following operating system processes:
PostgreSQL forks a new process for each new connection; thus, client and server processes communicate with each other without the intervention of the server main process (postgres), and they have a certain lifetime that is determined by accepting and terminating a client connection.
The aforementioned abstract conceptual PostgreSQL architecture gives an overview of the PostgreSQL capabilities and its interaction with the client, and the operating system. The PostgreSQL server could be divided roughly into four subsystems, as follows:
Almost all PostgreSQL components can be configured, including a logger, planner, statistical analyzer, and storage manager. PostgreSQL configuration is governed by the nature of the application, such as OLAP and OLTP.
PostgreSQL has a very cooperative, active, and organized community. In the last 8 years, the PostgreSQL community has published eight major releases. Announcements are brought to the developers via the PostgreSQL weekly newsletter. There are dozens of mailing lists organized into categories such as user, developer, and associations. Examples of user mailing lists are pgsql-general, psql-doc, and psql-bugs. pgsql-general is a very important mailing list for beginners. All non-bugs-related questions regarding PostgreSQL installation, tuning, basic administration, PostgreSQL features, and general discussions are submitted to this list.
The PostgreSQL community runs a blog aggregation service called Planet PostgreSQL (planet.postgresql.org). Several PostgreSQL developers and companies use this service to share their experience and knowledge.