- The IBM Netezza appliance is a test and development system that packs the performance and simplicity of Netezza’s unique architecture into a compact footprint. It offers customers an economical platform to develop and test their Business Intelligence (BI) and advanced analytic applications. It shares the characteristics of its enterprise-class counterpart: simplicity, ease of deployment and use, and hardware-based acceleration of analytic queries and workloads.
What is Netezza used for?
- The IBM Netezza appliance also includes a SQL dialect called Netezza Structured Query Language (NZSQL). You can use SQL commands to create and manage your Netezza databases, user access, and permissions for the databases, as well as to query and modify the contents of the databases.
Simplicity
- The IBM Netezza is an easy-to-use appliance that requires minimal tuning and administration, speeding up application development. It is delivered ready-to-go for immediate data loading and query execution and integrates with leading ETL, BI and analytic applications through standard ODBC, JDBC and OLE DB interfaces.
Performance
- The IBM Netezza system’s performance advantage comes from IBM’s unique Asymmetric Massively Parallel Processing™ (AMPP™) architecture, which combines open, blade-based servers with commodity disk storage and patented data filtering using Field Programmable Gate Arrays (FPGAs).
- As an appliance that shares the same software and hardware architecture with other members of the IBM Netezza data warehouse appliance family, the IBM Netezza is ideal for use as a test and development system for high-performance BI applications.
Value
- As a commodity-based appliance, IBM Netezza is a very affordable analytic option, delivering up to 10 TB of user data capacity in a compact physical and environmental footprint. The IBM Netezza appliance requires minimal ongoing administration, in both internal resources and implementation costs, for an overall low cost of ownership. There are no hidden costs.
Inside the IBM Netezza
- The IBM Netezza appliance is built using commodity blade servers and storage, turbocharged by FPGAs that filter out extraneous data as it streams off the disk. Each appliance contains a Snippet Blade™ (or S-Blade™), which is responsible for processing SQL queries in parallel across 8 pairs of Intel CPU cores and FPGA cores. Skimmer packs this power in a compact 7 rack-unit chassis, while still offering up to 10 TB of user data capacity.
IBM Netezza Analytics
- Analytics is an embedded, purpose-built, advanced analytics platform — delivered with every IBM Netezza appliance — that empowers analytic enterprises to meet and exceed their business demands.
- -Predict with more accuracy
- -Deliver predictions faster
- -Respond rapidly to changes
- IBM Netezza Analytics’ advanced technology fuses data warehousing and in-database analytics into a scalable, high-performance, massively parallel advanced analytic platform that is designed to crunch through petascale data volumes. This allows users to ask questions of the data that could not have been contemplated on other architectures. IBM Netezza Analytics is designed to quickly and effectively provide better and faster answers to the most sophisticated business questions.
- IBM Netezza Analytics is IBM Netezza’s most powerful advanced analytics platform that provides the technology infrastructure to support enterprise deployment of in-database analytics. The analytics platform allows integration of its robust set of built-in analytics with leading analytic tools from such vendors as Revolution Analytics, SAS, IBM SPSS, Fuzzy Logix, and Zementis, on IBM Netezza’s core data warehouse appliances.
- IBM Netezza pioneered the modern data warehouse appliance and has customers worldwide that have realized the value of combining data warehousing and analytics into a single, high-performance integrated system. IBM Netezza Analytics enables analytic enterprises to realize significant business value from new business models and helps companies realize both top-line revenue growth and bottom-line cost savings.
IBM Netezza Analytics capabilities
- Data exploration and discovery
- Data transformation
- Model building
- Model diagnostics
- Model scoring
IBM Netezza Architecture
- The IBM Netezza data warehouse appliance — a powerful parallel computing platform — is fully exploited by IBM Netezza Analytics to deliver high-speed, scalable analytics processing. The appliance uses the high-speed throughput of the Asymmetric Massively Parallel Processing (AMPP) architecture to maximize speed and efficiency for in-database analytics processing.
- The AMPP architecture is a blade-based streaming architecture that uses commodity blades and storage, combined with IBM Netezza’s patented data filtering using field programmable gate arrays (FPGAs), to deliver large data, high speed analytics. IBM Netezza has consolidated all analytics activity in a powerful and simple appliance.
- IBM Netezza Analytics is purpose-built to simplify the building and deploying of models for analytic enterprises that demand the highest performance on large, complex volumes of data.
Easy to use
- The IBM Netezza data warehouse appliance is easy-to-use and dramatically accelerates the entire analytic process. The programming interfaces and parallelization options make it straightforward to move a majority of analytics inside the appliance, regardless of whether they are being performed using tools from such vendors as IBM SPSS, SAS, or Revolution Analytics, or written in languages such as Java, Lua, Perl, Python, R or Fortran. Additionally, IBM Netezza data warehouse appliances are delivered with a built-in library of parallelized analytic functions, purpose-built for large data volumes, to kick-start and accelerate any analytic application development and deployment.
- The simplicity and ease of development is what truly sets IBM Netezza apart. It is the first appliance of its kind – packing the power and scalability of hundreds of processing cores in an architecture ideally suited for parallel analytics. Instead of a fragmented analytics infrastructure with multiple systems where data is replicated, IBM Netezza Analytics consolidates all analytics activity in a powerful appliance. It is easy to deploy and requires minimal ongoing administration, for an overall low total cost of ownership.
- Simplifying the process of exploring, calculating, modeling and scoring data is a key driver for successful adoption of analytics company-wide. With IBM Netezza, business users can run their own analytics in near real time, which helps analytics-backed, data-driven decisions become pervasive throughout an enterprise.
Netezza SQL functional categories
All SQL commands belong to one of the following functional categories:
- Data Definition Language (DDL)
- Data Control Language (DCL)
- Data Manipulation Language (DML)
- Transaction Control
- Miscellaneous commands
Data Definition Language (DDL)
- Use the IBM Netezza SQL Data Definition Language (DDL) to define, modify, and delete database objects, such as databases, tables, and views.
Data Control Language (DCL)
- As database security administrator, you use Data Control Language (DCL) SQL commands to control user access to database objects and their contents.
Data Manipulation Language (DML)
- Use Data Manipulation Language (DML) of SQL to access and modify database data by using the select, update, insert, delete, truncate, begin, commit, and rollback commands.
Transaction Control
- Transaction control enforces database integrity by ensuring that batches of SQL operations run completely or not at all. The transaction control commands are BEGIN, COMMIT, and ROLLBACK.
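The all-or-nothing behaviour described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in database so that it runs anywhere, and the `insert_batch` helper and the table layout are hypothetical. The same BEGIN, COMMIT and ROLLBACK pattern is what you would issue against Netezza, but nothing here is Netezza-specific.

```python
import sqlite3


def insert_batch(conn, rows):
    """Insert all rows in one transaction; undo every insert if any one fails."""
    conn.execute("BEGIN")
    try:
        for name, age in rows:
            conn.execute(
                "INSERT INTO allusers (name, age) VALUES (?, ?)", (name, age)
            )
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        # One statement failed, so discard the whole batch.
        conn.execute("ROLLBACK")
        return False


# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN/COMMIT/ROLLBACK statements above are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE allusers (name TEXT NOT NULL, age INTEGER)")

ok = insert_batch(conn, [("John", 30), ("Jason", 26)])
bad = insert_batch(conn, [("Jim", 20), (None, 21)])  # second row violates NOT NULL
count = conn.execute("SELECT COUNT(*) FROM allusers").fetchone()[0]
print(ok, bad, count)  # True False 2
```

The second batch leaves no trace in the table: the valid row for Jim is rolled back together with the invalid one.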
Functions and operators
- IBM Netezza SQL provides many functions and operators. Functions are operations that take a value, whereas operators are symbols.
- In many cases, you can use functions and operators to do the same task, so the difference is commonly in the syntax.
Netezza SQL supports the following types of functions:
- Numeric: performs mathematical operations on numeric data
- Text: manipulates strings of text
- Date and time: manipulates date and time values and extracts specific components from these values
- System: returns information specific to the RDBMS being used
- Fuzzy search and phonetic matching: provides approximate string matching that is based on defined techniques or algorithms
- User-defined: performs actions that are defined by the function developer
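To make "approximate string matching" concrete, here is a minimal sketch of one well-known technique, the Levenshtein edit distance, written in plain Python. It is an illustration of the idea only: the `levenshtein` function is hypothetical and is not Netezza's implementation, and the names and signatures of Netezza's own fuzzy search functions are not shown here.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ca
                    current[j - 1] + 1,      # insert cb
                    previous[j - 1] + cost,  # substitute (or keep) the character
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("John", "Jon"))        # 1
print(levenshtein("kitten", "sitting"))  # 3
```

A small distance means the two strings are close, which is how a fuzzy search can match "Jon" against a stored "John".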
IBM Data Warehousing and Analytics Solutions
- IBM provides the broadest and most comprehensive portfolio of data warehousing, information management and business analytic software, hardware and solutions to help customers maximize the value of their information assets and discover new insights to make better and faster decisions and optimize their business outcomes.
Netezza Architecture – Hosts
- The Netezza hosts are high-performance Linux servers that are set up in active-passive mode for high availability. If the active host fails, the passive host takes over the processing tasks; the takeover requires only a short time.
- The active host is the interface to external tools and client applications such as BI and ETL tools and JDBC/ODBC clients. Clients submit SQL requests via ODBC/JDBC, and tools such as Aginity, Squirrel, and the nzsql utility are used to submit SQL queries to the Netezza host. The host compiles them into executable code segments called snippets (usually C/C++ code) and creates optimised query plans, distributing the snippets across all the nodes for execution. The FPGA fetches the required data, and snippet execution takes place.
Field Programmable Gate Arrays – FPGA
- The FPGA is Netezza proprietary hardware developed to filter out unwanted data as early as possible once a SQL query is submitted to the hosts. Data is eliminated as early as the read from disk. This removes I/O bottlenecks and frees downstream components such as the CPU, memory, and network from processing extra data, which notably improves performance.
- The FPGA relies on zone maps to eliminate unwanted data. Zone maps are created for the columns of a table during certain Netezza operations.
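The idea behind zone maps can be sketched as follows. This is a simplified model, not Netezza's implementation: it assumes a zone map is just the minimum and maximum value of a column within each storage extent, and the names `build_zone_map` and `scan_between` are hypothetical. A range query can then skip any extent whose min/max range cannot contain a match, without reading it.

```python
def build_zone_map(extents):
    """Record the (min, max) of the column values stored in each extent."""
    return [(min(extent), max(extent)) for extent in extents]


def scan_between(extents, zone_map, low, high):
    """Return values in [low, high] and the number of extents actually read."""
    matches = []
    extents_read = 0
    for extent, (zone_min, zone_max) in zip(extents, zone_map):
        if zone_max < low or zone_min > high:
            continue  # no value in this extent can match, so skip the read
        extents_read += 1
        matches.extend(value for value in extent if low <= value <= high)
    return matches, extents_read


# Three extents holding one integer column; the values are invented for illustration.
extents = [[1, 5, 9], [12, 15, 18], [21, 27, 30]]
zone_map = build_zone_map(extents)
rows, read = scan_between(extents, zone_map, 13, 20)
print(rows, read)  # [15, 18] 1
```

Only one of the three extents is read for this query. The benefit depends on how well the data is clustered: if every extent spans the full value range, nothing can be skipped.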
Snippet Blades (S-Blades)
- S-Blades are intelligent processing nodes that make up the MPP engine of the Netezza data warehouse appliance. Each S-Blade is an independent server that contains powerful multi-core CPUs, multi-engine FPGAs and gigabytes of RAM, all working in parallel to deliver high performance. The FPGA in each S-Blade is a key piece of Netezza hardware for improving performance.
Disk Enclosure
- Finally, the other important piece of Netezza hardware is the high-performance disk storage. The disk enclosures contain high-density, high-performance disks that are RAID protected. Each disk contains a slice of the data in a database table. The host uses either a hash or a random algorithm to distribute the data evenly across all the disks. If mirroring is enabled, a mirror copy of each slice of data is maintained on a different disk drive.
- The disk enclosures are connected to the S-Blades via high-speed interconnects that allow all the disks to simultaneously stream data to the S-Blades at the maximum rate possible. Data distribution and storage are based on the distribution key specified when the table is created.
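Hash distribution on a key can be sketched as below. This is an illustration under assumptions, not Netezza's actual algorithm: CRC32 stands in for whatever hash function the appliance uses, and `data_slice_for`, `distribute`, and the sample rows are hypothetical. The point is only that the distribution key value alone decides which data slice a row lands on.

```python
import zlib


def data_slice_for(key, num_slices):
    """Map a distribution key value to a data slice number (CRC32 is a stand-in hash)."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return digest % num_slices


def distribute(rows, key_column, num_slices):
    """Place each row on the data slice chosen by hashing its distribution key."""
    slices = [[] for _ in range(num_slices)]
    for row in rows:
        slices[data_slice_for(row[key_column], num_slices)].append(row)
    return slices


rows = [{"id": i, "name": "user%d" % i} for i in range(1000)]
slices = distribute(rows, "id", 8)
print([len(s) for s in slices])  # row count per data slice
```

Because the same key always hashes to the same slice, a key with many distinct values spreads rows fairly evenly, while a key with few distinct values concentrates rows on a few slices and leaves the others idle.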
Netezza data warehouse
- Most Business Intelligence vendors keep growing their platforms until they become too large to be practical for everyday use. Netezza takes a different approach: by letting users build their own data warehouse appliances, Netezza’s specialists leave it to customers to decide what their appliances look like and what they are used for.
- As a result, every operation takes less time: instead of extracting data from one very large data warehouse that covers the whole company’s needs, a separate data warehouse is built for each problem to be solved, and the whole system is managed from within the Netezza TwinFin 4 platform.
Netezza TwinFin performance
- The unusual architecture of the Netezza TwinFin solution makes it possible to improve performance by up to 100 times. This speed comes from an easily manageable system that integrates three elements: storage, server, and database. Attention is also paid to built-in data compliance and the security of sensitive data (Netezza is the first company to offer both in a common appliance). Finally, administrators at companies using TwinFin can see who is accessing the data and for what purpose.
- Implementation is simplified because all the hardware, software, and storage are pre-configured. As a result, the solution is ready to use as soon as it is switched on. This “ready to go” approach lets TwinFin users begin loading data and executing queries immediately.
TwinFin simplicity
- Performance is matched by simplicity: all analytic activity is consolidated where the data is stored. The i-Class technology supports a variety of tools (SAS, R, Java, Python, Fortran) by letting them work simultaneously with its engines and libraries.
Netezza key features
The following are among TwinFin’s most significant features:
- support for both Business Intelligence and advanced analytics
- scalable (10-100x) performance at petascale
- efficient even when used by thousands of users at the same time
- i-Class technology for analytic development
- blade-based streaming architecture
- simple deployment and management
- data compliance
- compatibility with the most popular Business Intelligence and analytic tools
- standard SQL, ODBC, JDBC, and OLE DB interfaces
- reliability and availability at the 99.99% uptime level
- green orientation thanks to low cooling and power requirements
- high load rate: over 2 TB of data per hour
- high backup rate: over 4 TB of data per hour
Netezza data warehouse resources
http://www.netezza.com/testdrive/ –
- even the most detailed description cannot tell you as much as trying the solution in practice. Here you can order a trial: enter a few details about your company and wait for Netezza’s specialists to prepare a dedicated presentation for you.
http://www.techrepublic.com/whitepapers/new-architecture-and-speed-with-a-netezza-data-warehouse-appliance/386436 –
- a complete article about Netezza’s data warehouse appliances is available here. Registration is required first, but it is free and also subscribes you to the portal’s white paper newsletter. The paper focuses mostly on how companies struggle with large amounts of data when using traditional solutions, and how Netezza DW appliances can help.
http://www.digident-solutions.com/resource-center/Architectural_Comparison.pdf –
- if there is anything you wanted to know about the architecture of Netezza data warehouse appliances, it is covered in this article. This full compendium, published by Netezza specialists, answers the doubts of anyone hesitating over which solution to choose. All architectural aspects are covered, although it is a source better suited to more advanced users.
Connecting to Netezza Server From Python Sample
Check out my Ipython Jupyter Notebook with Python Sample.
Step 1:
- import jaydebeapi
Step 2: Setting Database connection settings:
- dsn_database = "avkash"
- dsn_hostname = "172.16.181.131"
- dsn_port = "5480"
- dsn_uid = "admin"
- dsn_pwd = "password"
- jdbc_driver_name = "org.netezza.Driver"
- jdbc_driver_loc = "/Users/avkashchauhan/learn/customers/netezza/nzjdbc3.jar"
- # URL format: jdbc:netezza://<server>:<port>/<dbName>
- connection_string = 'jdbc:netezza://' + dsn_hostname + ':' + dsn_port + '/' + dsn_database
- url = '{0}:user={1};password={2}'.format(connection_string, dsn_uid, dsn_pwd)
- print("URL: " + url)
- print("Connection String: " + connection_string)
Step 3: Creating Database Connection:
- conn = jaydebeapi.connect("org.netezza.Driver", connection_string, {'user': dsn_uid, 'password': dsn_pwd},
-     jars="/Users/avkashchauhan/learn/customers/netezza/nzjdbc3.jar")
- curs = conn.cursor()
Step 4: Processing SQL Query:
- curs.execute("select * from allusers")
- result = curs.fetchall()
- print("Total records: " + str(len(result)))
- print(result[0])
Step 5: Printing all records:
- for row in result:
-     print(row)
Step 6: Closing all connections:
- curs.close()
- conn.close()
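One possible refinement of steps 4 to 6 is to wrap the cursor handling in a small helper so the cursor is closed even if the query raises an error. The sketch below assumes only that the connection follows the Python DB-API 2.0 interface, which jaydebeapi connections do. The `fetch_rows` helper is hypothetical, and the built-in sqlite3 module is used as a stand-in connection so the example runs without a Netezza server or JDBC driver.

```python
import sqlite3
from contextlib import closing


def fetch_rows(conn, sql):
    """Run a query on any DB-API 2.0 connection and return all result rows."""
    # closing() calls curs.close() on exit, even when execute() raises.
    with closing(conn.cursor()) as curs:
        curs.execute(sql)
        return curs.fetchall()


# Stand-in connection and sample data, for illustration only.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE allusers (name TEXT, age INTEGER)")
demo_conn.executemany(
    "INSERT INTO allusers VALUES (?, ?)", [("John", 30), ("Jason", 26)]
)

rows = fetch_rows(demo_conn, "select * from allusers")
print("Total records: " + str(len(rows)))
demo_conn.close()
```

With a real Netezza connection, the `conn` object created in step 3 would be passed to `fetch_rows` in place of `demo_conn`.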
Connecting Netezza Server With Java Code Sample
Step 1: Have the Netezza driver as nzjdbc3.jar in a folder.
Step 2: Create netezzaJdbcMain.java as below in the same folder where nzjdbc3.jar is placed.
- import java.sql.Connection;
- import java.sql.DriverManager;
- import java.sql.ResultSet;
- import java.sql.SQLException;
- import java.sql.Statement;
- public class netezzaJdbcMain {
-     public static void main(String[] args) {
-         String server = "x.x.x.x";
-         String port = "5480";
-         String dbName = "_db_name_";
-         // URL format: jdbc:netezza://<server>:<port>/<dbName>
-         String url = "jdbc:netezza://" + server + ":" + port + "/" + dbName;
-         String user = "admin";
-         String pwd = "password";
-         String schema = "db_schema";
-         Connection conn = null;
-         Statement st = null;
-         ResultSet rs = null;
-         try {
-             // Load the Netezza JDBC driver from nzjdbc3.jar
-             Class.forName("org.netezza.Driver");
-             System.out.println(" Connecting … ");
-             conn = DriverManager.getConnection(url, user, pwd);
-             System.out.println(" Connected " + conn);
-             String sql = "select * from allusers";
-             st = conn.createStatement();
-             rs = st.executeQuery(sql);
-             System.out.println("Printing result…");
-             int i = 0;
-             while (rs.next()) {
-                 String userName = rs.getString("name");
-                 int year = rs.getInt("age");
-                 System.out.println("User: " + userName +
-                         ", age is: " + year);
-                 i++;
-             }
-             if (i == 0) {
-                 System.out.println(" No data found");
-             }
-         } catch (Exception e) {
-             e.printStackTrace();
-         } finally {
-             // Release resources in reverse order of creation
-             try {
-                 if (rs != null)
-                     rs.close();
-                 if (st != null)
-                     st.close();
-                 if (conn != null)
-                     conn.close();
-             } catch (SQLException e1) {
-                 e1.printStackTrace();
-             }
-         }
-     }
- }
Step 3: Compile the code as below:
$ javac -cp nzjdbc3.jar -J-Xmx2g -J-XX:MaxPermSize=128m netezzaJdbcMain.java
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
Note: You should see your main class is compiled without any problem.
Step 4: Run the compiled class as below:
- $ java -cp .:nzjdbc3.jar netezzaJdbcMain
- Connecting …
- Connected org.netezza.sql.NzConnection@3feba861
- Printing result…
- User: John , age is: 30
- User: Jason , age is: 26
- User: Jim , age is: 20
- User: Kyle , age is: 21
- User: Kim , age is: 27
Note: You will see results like above.
Conclusion:
- IBM provides the broadest and most comprehensive portfolio of data warehousing, information management and business analytic software, hardware and solutions to help customers maximize the value of their information assets and discover new insights to make better and faster decisions and optimize their business outcomes.