#quicktip #oracle #sqlcl logon.sql #PLSCOPE_SETTINGS and #PLSQL_WARNINGS

Quick Tip. My logon.sql file.

I don’t want plscope_settings and plsql_warnings set when I’m in production; but if I’m in my test / dev / sandbox environments, then I do turn them on.

This helps. Enjoy; this should be self explanatory.

set linesize 90
set pagesize 1000
col table_name format a35
col owner format a20
set timing on
 sInst varchar2(1);
 select upper(SUBSTR(instance_name, 1,1))
 INTO sInst
-- changed to use sys_context 2018/02/23
 select sys_context('userenv','instance_name')
 into sInst
 from dual;

-- test to see if this is a production instance
 -- all production instances start with P so ...
 -- if it's not a production instance set up
 -- session properties approiate for dev / test / sandbox.
 IF sInst != 'P' THEN
   execute immediate 'ALTER SESSION SET PLSQL_WARNINGS=' || '''ENABLE:ALL''';
define _editor=vi

-- setup aliases for sqlcl
alias tab=select table_name from user_tables;

#GDPR – RIGHT TO ACCESS. Security is a feature #3 Right to Access Part 1 of 2

The GDPR Right to access can get a bit complicated as it covers a few things that provide some challenges for us.

What is the purpose of the processing? Lets face it, we process data on people for a number of reasons.

First we are going to use the example of an online storefront. When a person places an order, a number of things happen with that person’s data. 1) Is there payment information accurate? 2) Do they have a store credit? 3) Does that person get a discount? 4) The parts that were ordered need to be shipped. 5) Store a history of the persons order, so the next time they want to order something, the system can make recommendations based on past orders.

If the shipping is being done by a third party such as Fedex or UPS, the person’s data is going to be transmitted to the shipping company. So now a person’s data is being held at both the online vendor and the shipping company.

What is the category of the data being processed? For the online storefront, we have payment, order and shipping information. For the shipping company, there is shipping and value information.

The recipients or categories of recipient to whom the personal data have been or will be disclosed, in particular recipients in third countries or international organisations. So in this case, let’s say that the package was shipped by FEDEX and let’s assume FEDEX data processing is done in the United States. (NOTE: this is an assumption, not a statement of fact.) Anyway the person’s shipping information has now left the EU for the United States of America. From the perspective of the online store vendor, the shipper will need to address Article 44 of the GDPR. I’ll get to Article 44 later.

Where possible, the envisaged period for which the personal data will be stored, or, if not possible, the criteria used to determine that period. So, how long will the data be stored? Certain types need to be kept for a defined period of time. ie. Financial information may need to kept for five or seven years that is defined by either applicable law or regulation. Other types of information may be kept for very short periods of time. I once worked on a system where the data was only resident for thirty days. This system packaged up the data and sent it to downstream system for further processing. Once the data was sent downstream, it was no longer needed. But, once the data was sent downstream, we would need to track what downstream systems received the personal data.

The existence of the right to request from the controller rectification or erasure of personal data or restriction of processing of personal data concerning the data subject or to object to such processing. You should be seeing a pattern here. It is going to be critical that we identify all PII (Personally Identifiable Information) in our systems. If a person identifies information we hold that is inaccurate, there needs to be a process in place to correct the information. If a person wants their information removed from our system, there needs a process in place to remove the information without corrupting or compromising the integrity of the system. Reference the Right to be forgotten and Article 17 of GDPR.

The right to lodge a complaint with a supervisory authority. We need to get into the definition of “Supervisory Authority.” I’ll address that in a later post. What kind of complaints can we expect? And what is the process to resolve those complaints? We need to spend time developing the process to address data complaints.

Where the personal data are not collected from the data subject, any available information as to their source. We feed downstream systems, and to be honest, there are a number of companies that sell our personal data. Say your company purchases mailing and phone list from a series of companies, you are now going to need to track the source of that data, so when the question arises, and it will; you can answer the question. What was the source of the data?

More to come.

#SQLDeveloper Unit Testing

I’ll get back to the GDPR series tomorrow.

Can I get a show of hands, How many people do unit testing? I’ve been in a lot of shops where unit testing and code reviews have been passed up in the name of Agile. First off, this is a misunderstanding of what Agile is. You do need to do Unit Test when doing Agile and code reviews are still highly recommended. Because this is not a post on Agile, that’s all I’ll say on that.

I’ve been working on a fairly complex problem for the past few months, that involves advanced analytics, and some simple test against a data set. This wound up being several thousand lines of code that had to be fast, accurate, secure and stable. Here is a shameless plug for SQL Developer, there is an awesome and flexible unit testing module built in to it. It’s quite easy to setup, and give you testing that is repeatable. Jeff Smith has a good write up on setting up and using Unit Testing. Oh, and did I mention, it’s free! Thanks guys.

So after building all my code, I was able to design some test to exercise all the procedures and functions for both successful execution and generate exceptions; so I know my code and deal with any exceptions that may come up.

If any test fail, the entire test fails. When all the test pass, than the testing has past. Really this made my job a lot easier. Identifying what code has bugs is easy, knowing what the input is, what is expected and what failed is easy.

Now that unit testing is complete, integration testing can start. Life is good.

#GDPR is coming Security is a feature #2 Data Masking

#GDPR is coming Security is a feature #2 Data Masking

In the context of GDPR, data masking should be done so that the all the data attributes on a person should not be able to define that person. Most of us know this, but we rarely articulate it; as you add more attributes to data it becomes much easier to define a person.

Use case: my full name is Robert Price Lockard, but let’s say there are a whole bunch of Roberts’ in the universe, therefore Robert does not uniquely identify me. Now let’s start adding attributes.

Let’s go further. I live in the United States of America; that narrows it down even more. Here is the frequency distribution of the top 10 names in the USA. So, I’m one of 5,497,484 Roberts’ in the US.

Rank Name Frequency
1 John 7,556,152
2 Marry 7,474,295
3 James 5,714,116
4 Robert 5,497,484
5 Michael 4,942,065
6 Christopher 4,747,669
7 William 4,665,950
8 Joseph 4,619,701
9 Elizabeth 4,270,062
10 Richard 4,109,367

I am a Pilot, okay, we’ve narrowed down the universe of Roberts’ to Roberts who are a pilot. There are still quite a few Roberts’ out there who are pilots.

Okay, so we have this big universe of Roberts’, how about we add my birth date of May 26, 1960. (Yea’ I’m pretty old). Now we are starting to narrow it down a bit, but in 1960, Robert was a very popular name and still is, so there are still quite a few Roberts’ in the population of Pilots and born May 26, 1960.

I live in the state of Maryland; we are narrowing down down the universe of Roberts’ even more. Say I purchase a lot of stuff from Aircraft Spruce and Specialty Company that would only be used for a 1948 Ryan Navion. With this information, you can likely make an educated guess to uniquely identify me.

Now let’s add in my zip code (postal code) of 21060. You just identified me. This is where things start to get tricky (or interesting depending on your point of view.)

Robert — there are a bunch

Pilot — there still are a bunch

USA — fewer, but still a bunch

Date of Birth — narrowed it down, but still not enough info to identify me.

State of Maryland — narrowing it down quite a bit, and most likely take an educated guess.

Zip code 21060 — nailed it, now you know who I am..

Now when we start thinking about masking data in the context of GDPR, we are also going to need to look at the type of business we are dealing with. If you are a bank, you may be able to mask down to zip code, this is because the vast majority of people in a zip code (postal code) have bank accounts.

But if you are a company like Aircraft Spruce and Specialty company (that gets a lot of my money) you should mask zip code and even go as far as masking the state. Why, because if you look at my purchase history from Aircraft Spruce, you would be able to determine the type of aircraft I own. And because my aircraft is pretty rare (I don’t know of any other 1948 Ryan Navions’ in the state of Maryland) it would be easy to determine who I am based on my purchase history.

Why would we mask data anyway? If you are using production to refresh lower environments; then you really need to start masking your data. This has as much to do with GDPR and your overall security profile; because guess what; when attacking a system only an amateur would go after your production system first. A hacker is coming for your DEV and TEST systems first, these environments are real noisy and it’s quite easy to hide in them for months on end. So if you are not masking your data in lower environments, then start doing it now. The other reason is under the “right to be forgotten.” It would be much easier to clean your production data of a person then to have to go through all of your lower environments to find all instances of a person to clean.


Name Frequency Distribution

#GDPR is coming Security is a feature #1

Security is feature. Over the next few weeks, I’ll be looking how some of the issues on how GDPR will be impacting us as technical professionals.

There are a lot of things that I find technically interesting in GDPR such as “Right to be forgotten.” How do we track all instances of a person we have in our systems? When we delete a person from our system, what about the artifacts in backups, and archive logs? We can’t go back and remove a person from our backups.

If a person made a transaction, and that person wants to be forgotten. We should address replacing the person to be forgotten with a generic person. Do you keep sex, age, postal code tied to the transaction details? The generic person attributes need to be addressed by your organization.

If a person downloads personal information to their workstation; we need to track that and ensure the data is scrubbed from the workstation. Therefore audit needs to configured so we know who has accessed the person’s data.