SAPy Trees

Monday, August 10, 2015

EMPOWERING SAS® USERS ON THE SAP HANA PLATFORM

Christoph Morgan from SAP provides this excellent resource for SAS developers working with SAP HANA.

http://support.sas.com/resources/papers/proceedings14/2445-2014.pdf

Wednesday, February 4, 2015

SAP S/4HANA

SAP has announced it is launching a new version of its ERP software, the biggest launch in the company’s history. And it runs on Cloud HANA with Fiori.

http://www.computerdealernews.com/news/sap-re-imagines-its-business-suite-with-s4-hana/39665

http://www.zdnet.com/article/sap-revamps-business-suite-with-new-ui-hana-analytics/

http://www.theregister.co.uk/2015/02/03/sap_hana_bet_biggest_thing_20_years/

At first I thought S/4 was some sort of partnership with HANA on Amazon S3, or perhaps an Audi on HANA. Unfortunate naming, though like HANA (upper-case it!) the name or acronym doesn’t really matter.

What does matter is that this offering cements the existence of two code branches for SAP ERP: Cloud-based HANA and Server-based Oracle-and-every-other-db-except-HANA. I wonder which one is going to get attention within SAP?

According to The Register article above, HANA is “only” deployed to 1.4% of SAP customers as of August 2014. Of course, they have 253,000 customers or so….

In 2014, SAP will focus on helping its customers “simplify everything, so they can do anything.” (on HANA)

I have a strong feeling 2015 SAP TechEd is going to be all about the ERPers and S/4HANA, just like it was all about HANA and Fiori a couple years ago.

S/4 is built on the SAP Fiori Design which includes those Metro live tiles found in Windows 8 with a couple hundred other functional controls. The power of Fiori is its consistency and responsive design when dealing with mobile apps.

DJ Adams has a good writeup about the various SAP UI flavours. This should give a bit of a (nice?) jolt to someone who has been using the legacy SAP UI.

And HANA can run an entire 700GB enterprise dataset on a 16GB iPhone. Recently I ported a table from SQL to HANA. HANA mushed up this 700MB 98% redundant wide SQL text dataset down to about 70MB which seems pretty efficient to me. Plus, look ma, no indexes!
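Part of why a redundant text table shrinks so much in a column store is dictionary encoding: each distinct value is stored once and every row keeps only a small integer code. A toy sketch of the idea follows. The sample column is made up, and this is an illustration of the general technique, not HANA's actual storage format.

```python
# Toy dictionary encoding of one column; an illustration, not HANA's real format.
def dictionary_encode(values):
    """Return (dictionary, codes): distinct values once, plus one small int per row."""
    dictionary = sorted(set(values))
    position = {value: i for i, value in enumerate(dictionary)}
    codes = [position[value] for value in values]
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


# Hypothetical, highly repetitive text column.
column = ["CANADA", "CANADA", "GERMANY", "CANADA", "GERMANY", "CANADA"]
dictionary, codes = dictionary_encode(column)
print(dictionary)
print(codes)
```

The more repetitive the column, the smaller the dictionary relative to the row count, which is the situation a 98% redundant table is in.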

The real question is, will SAP S/4HANA run on Raspberry Pi for free?

Alternate States–Python and Anaconda

While reading a great post about teaching data science (http://www.dataschool.io/teaching-data-science/) I came again upon the Continuum Analytics Anaconda distribution of Python.

Once you start working with Python, some of the flaws become apparent, especially in Windows environments. Python Package Management Sucks. As with Linux, there are a few ways to install modules (add-ons, libraries, or packages). pip and easy_install are the most common, apart from just running python setup.py install. These will sometimes break if you don’t have a compiler or if the package has external dependencies. It can become a bit frustrating, especially if you are just trying out some simple code and the first import fails, and you think a command line is something out of the military.

Python needs an Android-style app store…

For training and data science, Anaconda has a good distribution of Python with many of the commonly used libraries. It uses yet another package manager, Conda. This manager is supposed to fix some of the flaws of the other package managers by providing a better way to get some of the dependencies outside of the primary packages.

The typical approach is to open an Anaconda Command Prompt outside the IDE and run the following.

cd scripts
conda install <app>

The [Unofficial Windows Binaries for Python Extension Packages] is probably the most useful site if you’re on a Windows platform and can’t install libraries that are not compiled.
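Before a script falls over on its first import, it can help to check which modules are actually available. Here is a minimal sketch using importlib from the standard library (it needs Python 3.4 or later). The second module name is deliberately made up, and the conda hint in the message is just a suggestion, not something the script runs.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is deliberately made up.
wanted = ["json", "surely_not_installed_pkg_xyz"]
for name in missing_modules(wanted):
    print("missing: {0} (try: conda install {0})".format(name))
```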

Another frustrating thing about Python, which is quite similar to R, is the version hell. The latest version of Python is 3.4. The default version of Python in many environments is 2.7. Many libraries stick with the default version. Even the simple print "hello world" statement from version 2.7 (2010) has become a function call by 3.4 (2014): print("omg this is kind of batty").

For the true immersive command-line experience on Windows, either set up a Linux box in the cloud or in a VM, or install the Cygwin tools and add their directory to your System – Advanced Settings – Environment Variables – Path.

Python does have a time travel feature which is nice, though it doesn’t seem to work well on some older versions.

http://stackoverflow.com/questions/388069/python-graceful-future-feature-future-import

from __future__ import print_function

or

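import __future__, sys
# __future__ is a module, so test for the feature with hasattr(), not "in".
if hasattr(__future__, "print_function"):
    # Could also check sys.version_info >= __future__.print_function.optional
    import app  # your own application module
    app.main()
else:
    # sys.stdout.write parses on every Python version, unlike a print statement.
    sys.stdout.write("download Anaconda 3.4\n")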
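To see what the import actually buys you, here is a small sketch. On Python 3 the import is a no-op, while on 2.6 or 2.7 it turns print into a function, so keyword arguments such as sep and file become available. I have only traced the Python 3 behaviour of the buffer part, so treat the rest as an assumption.

```python
from __future__ import print_function  # must be the first statement in the file

import io
import sys

# With the import above, print(...) is parsed as a function call.
print("running on Python", sys.version_info[0])

# Function-only features: keyword arguments such as sep and file.
buffer = io.StringIO()
print("a", "b", "c", sep="-", file=buffer)
result = buffer.getvalue()
```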

Monday, January 26, 2015

Automating SAP HANA with Python

Basic Tutorial for using Python scripts and CLI to access HANA.

http://saphanatutorial.com/sap-hana-and-python/

with a gentle introduction to Python courtesy of MIT.

http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-189-a-gentle-introduction-to-programming-using-python-january-iap-2008/

For those Microsoft-centric developers (like me) there is IronPython.
http://ironpython.net/
and IronPython Tools for Visual Studio.
http://ironpython.net/tools/

Some DB integration testing strategies and Parallel Processing of Tests
https://julien.danjou.info/blog/2014/db-integration-testing-strategies-python

Some approaches for data-driven tests.
http://sqa.stackexchange.com/questions/6678/what-are-some-good-approaches-to-separating-test-data-from-test-scripts
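One way to keep test data apart from test scripts, sketched with nothing but the standard library: the cases live in a JSON document and a single test method loops over them. The tax function and the case values are hypothetical, and the JSON is inlined only so the sketch is self-contained; in practice it could sit in a file next to the script.

```python
import json
import unittest

# Test data kept apart from test logic (inline here, a file in real life).
CASES_JSON = """
[
  {"name": "whole",  "amount": 100.0, "rate": 0.05, "expected": 5.0},
  {"name": "zero",   "amount": 0.0,   "rate": 0.05, "expected": 0.0},
  {"name": "no tax", "amount": 80.0,  "rate": 0.0,  "expected": 0.0}
]
"""


def tax(amount, rate):
    """Hypothetical function under test."""
    return round(amount * rate, 2)


class TaxCases(unittest.TestCase):
    def test_all_cases(self):
        for case in json.loads(CASES_JSON):
            # subTest reports each failing row separately instead of stopping at the first.
            with self.subTest(case=case["name"]):
                self.assertEqual(tax(case["amount"], case["rate"]), case["expected"])


def run_cases():
    """Run the suite programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TaxCases)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Adding a new case then means adding a row of data, not another test method.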

Parallel Testing with Python, Ruby, Node.js, R and SAP HANA

Julien Danjou, author of "The Hacker's Guide to Python", gave me a wonderful idea. What if your in-memory database platform could handle a number of high-load concurrent queries? Perhaps you have SAP HANA and would like to perform some tests to ensure your analytic and calculation views are functioning as expected?

What if you could bombard your database with your entire suite of tests with a single parallel call?
https://julien.danjou.info/blog/2014/db-integration-testing-strategies-python

While maybe not the best way to become friends with your network administrator, this could showcase some of the awesome performance that is in SAP HANA.

Since the HANA ODBC drivers come with Python installed by default, leveraging Python for automation and unit testing just makes sense to me.

May as well host it in a Bottle.
http://bottlepy.org/docs/dev/

Sniff out your tests with Nose
http://nose.readthedocs.org/en/latest/
http://nose.readthedocs.org/en/latest/testing.html

Python Testing Taxonomy
https://wiki.python.org/moin/PythonTestingToolsTaxonomy

There is supposed to be support for SAP HANA in a flavour of SQLAlchemy - I couldn't find it.
http://www.sqlalchemy.org/

Getting started with HANA and Python.
http://saphanatutorial.com/sap-hana-and-python/
http://scn.sap.com/community/developer-center/hana/blog/2012/11/29/sqlalchemy-and-sap-hana

Comparing 2 CSV files
http://stackoverflow.com/questions/24556970/python-compare-two-csv-files-line-by-line
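A minimal sketch of a line-by-line comparison with the standard csv module, for example to check an exported result set against an expected one. The two in-memory "files" and their contents are made up; with real exports you would pass open file handles instead.

```python
import csv
import io


def diff_csv(expected_file, actual_file):
    """Yield (line_number, expected_row, actual_row) for every row that differs."""
    expected_rows = list(csv.reader(expected_file))
    actual_rows = list(csv.reader(actual_file))
    for index in range(max(len(expected_rows), len(actual_rows))):
        expected = expected_rows[index] if index < len(expected_rows) else None
        actual = actual_rows[index] if index < len(actual_rows) else None
        if expected != actual:
            yield index + 1, expected, actual


# In-memory stand-ins for two exported result sets (hypothetical data).
expected_csv = io.StringIO("id,amount\n1,10\n2,20\n")
actual_csv = io.StringIO("id,amount\n1,10\n2,25\n3,30\n")

differences = list(diff_csv(expected_csv, actual_csv))
for line_number, expected, actual in differences:
    print(line_number, expected, actual)
```

A row that exists in only one file shows up with None on the other side. Note that this compares by position, so the two exports need the same sort order.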

Exporting highly-formatted XLSX files with formulas and experimental macros
https://xlsxwriter.readthedocs.org/

Comparing those Excel files in pandas
https://xlsxwriter.readthedocs.org/

Parallel scenario testing
https://launchpad.net/testscenarios

Examples of HANA testing using LoadRunner
http://www.slideshare.net/SAPSolutionExtensions/testing-sap-hana-with-sap-loadrunner-by-hp

Examples of T-SQL tests that could be translated to SAP HANA, perhaps with the T-SQL to SQLScript translator
https://www.simple-talk.com/sql/t-sql-programming/sql-server-unit-testing-with-tsqlt/
http://www.codeproject.com/Articles/841250/Create-SQL-Server-Database-Unit-Tests
http://stackoverflow.com/questions/754527/best-way-to-test-sql-queries
https://msdn.microsoft.com/en-us/library/jj851212(v=vs.103).aspx
http://tsqlt.org/user-guide/tsqlt-tutorial/

IronPython & ODBC
http://www.ironpython.info/index.php?title=Databases_with_Odbc

pyodbc
http://www.easysoft.com/developer/languages/python/pyodbc.html

FitNesse & decision tables
http://fitnesse.org/FitNesse.UserGuide.TwoMinuteExample

ODBCTrace/SQLDBTrace
https://websmp130.sap-ag.de/sap/support/notes/1993254
http://service.sap.com/sap/support/notes/1993251

Save the results to Confluence Wiki
https://marketplace.atlassian.com/plugins/com.atlassian.labs.rest-api-browser
http://mattryall.net/blog/2008/06/confluence-python
https://ecosystem.atlassian.net/wiki/display/BLOG/XML-RPC+Page+Updater+Example

Blog your results
https://ecosystem.atlassian.net/wiki/display/BLOG/BlogginRPC+Plugin+Python+Scripts

If you don't want to go down the path of using Python for your unit tests, why not Ruby?
https://prograils.com/posts/getting-your-rails-app-running-on-the-sap-hana-cloud-platform
http://flavio.castelli.name/2010/05/28/rails_execute_single_test/

Or Node.js?
https://github.com/SAP/node-hdb

Test with Alpaca
http://scn.sap.com/community/developer-center/hana/blog/2014/09/11/alpaca--unit-test-over-hana

Or wait for HANA SPS 09 with Mockstar
http://scn.sap.com/community/developer-center/hana/blog/2014/12/09/sap-hana-sps-09-new-developer-features-hana-test-tools
http://mockstar.readthedocs.org/en/latest/

SAP HANA System Views & SQL Reference
https://help.sap.com/saphelp_hanaplatform/helpdata/en/b4/b0eec1968f41a099c828a4a6c8ca0f/content.htm?current_toc=/en/2e/1ef8b4f4554739959886e55d4c127b/plain.htm&show_children=true

Learn more with Shine
http://help.sap.com/hana/sap_hana_interactive_education_shine_en.pdf

Perhaps it makes sense to expose your HANA views as OData services and test those instead?
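If you go the OData route, a test mostly boils down to building a query URL and checking what comes back. Here is a sketch of the URL-building half using only the standard library. The host, package path and entity set are invented for illustration; the system query options ($filter, $top, $format) are standard OData, but check what your own service exposes.

```python
from urllib.parse import quote, urlencode


def odata_url(service_root, entity_set, **options):
    """Build an OData query URL; option names are passed without the leading '$'."""
    query = {"$" + name: value for name, value in options.items()}
    # Keep '$' literal and encode spaces as %20 rather than '+'.
    return "{0}/{1}?{2}".format(
        service_root.rstrip("/"), entity_set,
        urlencode(query, safe="$", quote_via=quote))


# Host, package path and entity set are made up for illustration.
url = odata_url("https://hana.example.com/demo/services/sales.xsodata",
                "Orders", filter="Region eq 'EMEA'", top=5, format="json")
print(url)
```

The resulting URL can then be fetched with whatever HTTP client you prefer and the JSON payload compared against expected values.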

OData & Testing OData
http://scn.sap.com/people/lucas.sparvieri/blog
http://scn.sap.com/community/gateway/blog/2013/11/27/ecatt-based-test-automation-for-odata-services-available

http://scn.sap.com/community/developer-center/hana/blog/2012/12/21/hana-development-xs-odata-services

http://www.asp.net/web-api/overview/testing-and-debugging/unit-testing-with-aspnet-web-api
http://www.asp.net/web-api/overview/odata-support-in-aspnet-web-api/odata-v4/create-an-odata-v4-client-app

Getting a little bit wackier with the possibility of using the HANA R integration to compare 2 data frames.

http://scn.sap.com/community/developer-center/hana/blog/2012/05/21/when-sap-hana-met-r--first-kiss

Comparing 2 resultsets in R.
http://stackoverflow.com/questions/3171426/compare-two-data-frames-to-find-the-rows-in-data-frame-1-that-are-not-present-in

http://www.cookbook-r.com/Manipulating_data/Comparing_data_frames/
http://cran.r-project.org/web/packages/compare/compare.pdf
http://www.r-bloggers.com/identifying-records-in-data-frame-a-that-are-not-contained-in-data-frame-b-%E2%80%93-a-comparison/

http://www.johnmyleswhite.com/notebook/2010/08/17/unit-testing-in-r-the-bare-minimum/

Tuesday, October 28, 2014

Create an OData service with HANA and R

Interesting example of wrapping R up into an exposed OData layer out of SAP HANA.
http://scn.sap.com/community/developer-center/hana/blog/2013/10/08/creating-an-odata-service-using-r

How about using SQL in R? Lubridate'ing? Random Forests?
http://blog.yhathq.com/posts/10-R-packages-I-wish-I-knew-about-earlier.html

Exporting to Excel? Plus installing a bunch of other packages in a single shot?
https://gist.github.com/bearloga/10988512

Wednesday, June 11, 2014

Boasting about Oracle's In-Memory Database

Oracle is trying to halt some of the migrations from Oracle to SAP HANA with its new Oracle 12c In-Memory technology. Basically it allows you to "pin" tables in memory in a columnstore cache.

Some speed boasts:

Database queries and analytics running between 100 and 1,000 times faster than in the past.

With in-memory technology, Oracle 12c database allows each CPU core to scan 2.5 billion rows per second.

The time it takes for the 12c database to process 10 million invoice lines has been shrunk from 244 minutes to 4 seconds.

The time it takes to run a financial analysis program is cut from about four hours to roughly 12 seconds.

A query against a system that keeps track of a company’s transportation network, featuring 16,000 drivers and 60 million shipment data records, is slashed from 16 minutes to under a second.

A process that had previously taken 58 hours now needs only 13 minutes.
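Turning the quoted before and after times into ratios makes them easier to compare with the "100 to 1,000 times faster" headline. A quick sketch follows; the labels are mine, and the transportation example is left out because "under a second" has no exact figure.

```python
# Before/after timings quoted in the list above, converted to seconds.
claims = {
    "10 million invoice lines": (244 * 60, 4),
    "financial analysis": (4 * 60 * 60, 12),
    "58-hour process": (58 * 60 * 60, 13 * 60),
}

speedups = {name: before / after for name, (before, after) in claims.items()}
for name, factor in speedups.items():
    print("{0}: about {1:.0f}x faster".format(name, factor))
```

By this arithmetic the invoice and financial analysis examples come out above 1,000x, while the 58-hour process lands in the low hundreds.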

Welcome to a world where disks are a thing of the past. Pretty soon I predict that physical disks will go the way of tape drives, and we'll all be running with 2-4 terabytes of RAM.

http://diginomica.com/2014/06/11/larry-ellison-database-fast-customers-broken/?utm_source=dlvr.it&utm_medium=twitter&utm_campaign=Feed%3A+diginomica2+%28diginomica%29