Unit - 5
Parallel Databases
The following are the most typical architectures for implementing multi-user database management systems:
● Teleprocessing
● File-Server
● Client-Server
Teleprocessing
There is a single computer with a single CPU and several terminals.
Within the same physical computer, processing takes place. User terminals are often "dumb," unable to function alone, and are connected to a central computer.
Fig 1: Teleprocessing
File server
● Several workstations are connected to a file server through a network.
● The database is stored on a file server.
● On each workstation, a database management system (DBMS) and applications are installed.
● The following are some disadvantages:
○ There is a lot of network traffic.
○ Each workstation has a copy of the DBMS.
○ More difficult to control concurrency, recovery, and integrity.
File server architecture
The processing in a file-server system is distributed over the network, which is usually a local area network (LAN).
Fig 2: File server
Traditional Two - tier Client Server architecture
The client (tier 1) is in charge of the user interface and the execution of applications.
The database and database management system (DBMS) are stored on the server (layer 2).
● Wider access to existing databases;
● Enhanced performance;
● Probable cost savings on hardware;
● Cost savings on communication;
● Increased consistency are just a few of the benefits.
Fig 3: Traditional two-tier client server architecture
At least one database instance and one database make up an Oracle Database. Memory and processes are managed by the database instance. The database can be a non-container database or a multitenant container database, and it is made up of physical files called data files. During the operation of an Oracle Database, various database system files are used.
One database instance and one database make up a single-instance database design. The database and the database instance have a one-to-one relationship. On the same server system, many single-instance databases can be installed. For each database, there are different database instances. This configuration is handy for running multiple Oracle Database versions on the same machine.
Oracle Real Application Clusters (Oracle RAC) databases are made up of many instances that run on different server machines. They all have access to the same database. On one end, the cluster of server machines appears as a single server, while on the other end, end users and applications see it as a single server. This setup is optimized for excellent availability, scalability, and overall performance.
The database server process is the listener. It receives client requests, establishes a database connection, and then passes the client connection to the server process. The listener can be run locally or remotely on the database server. Oracle RAC environments are typically run remotely.
Fig 4: This interactive diagram shows the Oracle Database 18c technical architecture
In the early days of computers, technology simply improved the efficiency of manual procedures. New innovations enabled new capabilities and procedures in the company that were driven by IT as technology progressed. IT gradually altered the business, although not always in a way that was in line with the company's strategy. This misalignment resulted in considerable resource waste and missed opportunities, as well as putting the company at a competitive disadvantage in the marketplace. Enterprise Architecture is a novel strategy to manage IT that has been designed to connect business and IT strategies.
Enterprise architecture
Enterprise Architecture (EA) is a method and organizational principle for aligning functional business goals and strategies with an IT strategy and implementation plan. Enterprise Architecture serves as a roadmap for organizations' technological progress and change. As a result, IT becomes a more strategic asset for achieving a modern corporate strategy.
Typical outcomes from an Enterprise Architecture are:
● Enterprise Architecture Model in Its Current State
● The reference model for Future State Enterprise Architecture is required to carry out the proposed business strategy.
● Gap analysis that highlights the present state's shortcomings in terms of its capacity to support the business Architecture Roadmap's objectives and strategies, as well as the activities required to migrate from the current to the future state.
An EA ensures that the corporate goals and objectives are handled holistically across all IT initiatives by taking an enterprise-wide perspective spanning business services, business processes, information, applications, and technology.
Enterprise architecture is about continual communication between business and IT leaders as much as it is about technological advancements and architectural decisions.
The Oracle Enterprise Architecture Framework
Oracle built a hybrid EA framework influenced by TOGAF, FEA, and Gartner in order to deliver an efficient, business-driven framework to help our customers align their IT and business strategy. The Oracle Enterprise Architecture is a simple yet useful and prescriptive framework (OEAF). With unambiguous mappings to TOGAF and FEA, the OEAF complements other EA frameworks, allowing clients to utilize the EA framework of their choice. The OEAF was created with the goal of combining the benefits of several industry frameworks with Oracle's experience in producing enterprise solutions.
The Oracle Enterprise Application Framework's fundamental premise is to give "just enough" structure that may be constructed "just in time" to satisfy the organization's business requirements. Furthermore, the OEAF provides a well-known architectural platform for sharing Oracle's considerable intellectual capital surrounding enterprise IT solutions with customers and partners, increasing Oracle's strategic business value proposition.
There are nine essential values in the Oracle Enterprise Application Framework.
● Business strategy is at the heart of all we do.
● The technical architecture is standardized and simplified.
● Modeling that is "just enough" for enterprise solution architecture efforts.
● Recycles industrial and commercial suppliers' best-practice business models and reference architectures.
● For high-level guidance, the first focus is on speed of delivery.
● Collaboration with business owners, stakeholders, and expert architects was used to create this project.
● For breadth and depth, it is developed repeatedly and evolves through time.
● It is possible to enforce.
● It is technology agnostic, however it does make use of Oracle's knowledge and intellectual property.
These concepts serve as the foundation for mapping business needs to IT execution in agile enterprise architecture.
Components of Oracle Enterprise Architecture Framework
There are seven main components that make up the Oracle Enterprise Architecture Framework.
Business architecture
Business Architecture should be the starting point for every architectural debate. The Business Architecture links an organization's operating model, plans, and objectives with IT, as well as creating a business case for IT reforms and providing a functional view of the enterprise.
Fig 5: Oracle Enterprise Architecture Framework Components
Application architecture
In accordance with the application strategy, the Application Architecture provides an application- and services-centric view of an organization that relates business operations and services to application processes and services to application components. The scope, strategy, and standards of the Application Architecture are all influenced by the Business Architecture.
Information architecture
The Information Architecture outlines all of the moving components involved in managing information across the enterprise and sharing it with the right people at the right time in order to achieve the business goals outlined in the business architecture.
Technology Architecture
The Technology Architecture specifies the organization of the infrastructure that supports the business, application, and information architectures.
People, Process, and Tools
The people, methods, and technologies utilized to define enterprise architectures and architecture solutions are identified in this section of the framework.
People - Enterprise architecture teams and individuals are charged with a variety of activities, including architectural creation, management, implementation, and governance.
Process - A set of architectural processes chosen and followed to drive an architecture engagement down a path that increases the likelihood of a successful implementation while minimizing resource expenditure.
Tools - A combination of tools and technology that let you create and manage enterprise architecture faster. The majority of these products are in the modeling, portfolio management, and architecture asset repositories categories (for example, ARIS IT Architect, Oracle BPA Suite) (for example, Oracle Enterprise Repository).
Enterprise Architecture Governance
The framework and mechanisms for achieving an organization's business strategy and objectives through an Enterprise Architecture are provided by Enterprise Architecture governance. During IT transformations and solution deployments, an EA governance body is utilized to guide each project and assure alignment with the EA.
The following are some criteria for evaluating the performance of parallel databases:
- Response time - It is the amount of time it takes to execute a single task in a particular amount of time.
- Speed up in the parallel database - The process of raising the degree of (resources) parallelism in order to accomplish a running job in less time is known as speeding up.
The time it takes to complete a task is inversely related to the quantity of resources available.
Formula:
Speed up = TS / TL
Where,
TS = Time required to execute task of size Q
TL = Time required to execute task of size N*Q
● Linear speed-up is N (Number of resources).
● Speed-up is sub-linear if speed-up is less than N.
Fig 6: Speed up in parallel database
3. Scale up in the parallel database - When the number of processes and resources rises correspondingly, scale-up refers to the ability to maintain performance.
Formula:
Let Q be the Task and QN the task where N is greater than Q
TS = Execution time of task Q on smaller machine MS
TL = Execution time of task Q on smaller machine ML
Scale Up = TS /TL
Fig 7: Scale up in parallel database
A parallel database management system (DBMS) is a database management system that works on several processors or CPUs and is primarily designed to execute query operations in parallel whenever possible. The parallel DBMS connects several smaller machines together to obtain the same throughput as a single large system.
There are three architectural ideas for parallel DBMS in Parallel Databases. The following are the details:
● Shared Memory Architecture
● Shared Disk Architecture
● Shared Nothing Architecture
Shared Memory Architecture
Multiple processors are connected to a global shared memory via an intercommunication channel or communication bus in a shared memory system.
Because each processor in a shared memory system has a considerable quantity of cache memory, referencing the shared memory is avoided.
When a processor conducts a write operation to a memory location, the data in that location should be updated or erased.
Fig 8: Shared memory system in parallel database
Advantages
● Any processor can simply access data.
● One processor can efficiently convey messages to another.
Disadvantages
● Due to the increasing number of processors, the processing time has increased.
● There's a bandwidth issue.
Shared Disk Architecture
A shared disk system has many processors, each of which has local memory and is connected to numerous disks via an intercommunication channel.
Because each processor has its own memory, data sharing is quick.
Clusters are the systems that are constructed around this system.
Fig 9: Shared disk system in parallel database
Advantages
● A shared disk system is used to achieve fault tolerance.
● If one CPU or its memory fails, the task can be completed by another processor. Fault tolerance is the term for this.
Disadvantages
● Because a substantial amount of data flows across the interconnection channel, the scalability of a shared disk system is constrained.
● Existing processors are delayed as more processors are added.
Application
Digital Equipment Corporation (DEC): DEC cluster running relational databases use the shared disk system and now owned by Oracle.
Shared Nothing Architecture
In the shared nothing system, each CPU has its own local memory and disk.
Intercommunication channels allow processors to communicate with one another.
Any CPU can be used as a server to serve data from a local disk.
Fig 10: Shared nothing system in parallel database
Advantages
In a share nothing disk system, the number of processors and disks can be connected as needed.
The shared nothing disk system can handle several processors, making it more scalable.
Disadvantages
● In a shared nothing disk system, data partitioning is required.
● Communication costs for accessing a local disk are substantially higher.
Application
● Database machine with terabytes of data.
● The research prototypes Grace and Gamma.
Key takeaway:
- A parallel database management system (DBMS) is a database management system that works on several processors or CPUs and is primarily designed to execute query operations in parallel whenever possible.
- The parallel DBMS connects several smaller machines together to obtain the same throughput as a single large system.
- Multiple processors are connected to a global shared memory via an intercommunication channel or communication bus in a shared memory system.
- A shared disk system has many processors, each of which has local memory and is connected to numerous disks via an intercommunication channel.
- In the shared nothing system, each CPU has its own local memory and disk.
The following are the two methods for evaluating queries:
- Inter query parallelism
This method allows you to run many queries on several processors at the same time.
Inter query parallelism is used to achieve pipelined parallelism, which increases the system's output.
For instance, if there are six queries, each will take three seconds to evaluate. As a result, the entire evaluation procedure took 18 seconds to complete. This task takes only 3 seconds thanks to inter query parallelism.
Inter query parallelism, on the other hand, is tough to achieve every time.
2. Intra Query Parallelism
This approach divides a question into sub queries that can run on many processors at the same time, reducing query evaluation time.
Intra-query parallelism enhances the system's response time.
For instance, if we have 6 queries, each of which takes 3 seconds to evaluate, the total time to finish the assessment process is 18 seconds. However, because each question is separated into sub-queries, we can complete this operation in under 3 seconds utilizing intra query evaluation.
Optimization of Parallel Query
● Selecting the most effective query evaluation strategy is what parallel query optimization is all about.
● Parallel query optimization is critical in the development of systems that reduce the cost of query evaluation.
Two factors play a very important in parallel query optimization.
a) the overall amount of time it took to discover the optimal plan.
b) the length of time it will take to carry out the plan.
Query Optimization is done with an aim to:
● Find the queries that can produce the fastest results on execution to speed up the inquiries.
● Increase the system's performance.
● Choose the optimal query evaluation strategy.
● Avoid the unwelcomed strategy.
Virtualization on multicore processors
● By adding more CPUs, the computer's processing power is boosted. Virtualization is the name given to this method.
● Multiple CPUs are added to the host machine, which improves the system's performance.
● There are no set guidelines for increasing the number of CPUs.
● Multicore processors are capable of solving complex processing problems.
● This method is employed in the processing of big loads.
The diagram below shows a virtualization on a multicore processor.
Fig 11: Virtualization on multicore processor
Key takeaway
- Inter query parallelism is used to achieve pipelined parallelism, which increases the system's output.
- Intra-query parallelism enhances the system's response time.
- By adding more CPUs, the computer's processing power is boosted. Virtualization is the name given to this method.
- Multiple CPUs are added to the host machine, which improves the system's performance.
References:
- A. Silberschatz, H.F. Korth and S. Sudarshan, “Database System Concepts”, McGraw Hill, 6th Edition.
- Martin Gruber, “Understanding SQL”, Sybex Publications.
- Ivan Bayross, “SQL- PL/SQL”, BPB Publications, 4th Edition.
- S.K. Singh, “Database Systems: Concepts, Design and Application”, Pearson, Education, 2nd Edition.