Posts Tagged ‘oracle’
GNU basename in PL/SQL
Reposted from The Pythian Group blog.
In the process of scripting a database migration, I was in need of something akin to the GNU basename utility that I know and love on Linux. basename is most famous for taking a full file path string and stripping away the leading path component, returning just the name of the file. This can be emulated in PL/SQL with calls to SUBSTR and INSTR, like this:
substr(dirname,instr(dirname,'/',-1)+1)
(Thanks to Ian Cary, who shared this logic on oracle-l)
As you can see, this simply finds the last occurence of /, which is our directory separator on *nix and Solaris operating systems. On Windows, it would be \. It then returns a substring beginning one character after that last separator until the end of the string. Voila, a basic basename routine!
Upon reading the basename man page again, I found that basename also takes an optional parameter, a suffix string. If this suffix string is provided, basename will also truncate that string from the end. For example:
$ basename /home/seiler/bookmarks.html bookmarks.html $ basename /home/seiler/bookmarks.html .html bookmarks
I decided that this would be handy to have, and set out to create a compatible basename function in PL/SQL. Here is what I came up with:
CREATE OR REPLACE FUNCTION basename (v_full_path IN VARCHAR2,
v_suffix IN VARCHAR2 DEFAULT NULL,
v_separator IN CHAR DEFAULT '/')
RETURN VARCHAR2
IS
v_basename VARCHAR2(256);
BEGIN
v_basename := SUBSTR(v_full_path, INSTR(v_full_path,v_separator,-1)+1);
IF v_suffix IS NOT NULL THEN
v_basename := SUBSTR(v_basename, 1, INSTR(v_basename, v_suffix, -1)-1);
END IF;
RETURN v_basename;
END;
/
I’ve also added an optional third parameter to specify a directory separator other than the default. It would probably be rarely useful, but not hard to remove if you don’t like it. As you can see, I’ve used similar SUBSTR/INSTR logic to identify the suffix index and prune it out.
Here it is in action:
SQL> COLUMN file_name FORMAT a45; SQL> COLUMN basename FORMAT a15; SQL> COLUMN no_suffix FORMAT a12; SQL> SELECT file_name 2 , basename(file_name) as basename 3 , basename(file_name, '.dbf') as no_suffix 4 FROM dba_data_files; FILE_NAME BASENAME NO_SUFFIX --------------------------------------------- --------------- ------------ /u01/app/oracle/oradata/orcl/users01.dbf users01.dbf users01 /u01/app/oracle/oradata/orcl/sysaux01.dbf sysaux01.dbf sysaux01 /u01/app/oracle/oradata/orcl/undotbs01.dbf undotbs01.dbf undotbs01 /u01/app/oracle/oradata/orcl/system01.dbf system01.dbf system01 /u01/app/oracle/oradata/orcl/example01.dbf example01.dbf example01
I hope this makes your work just a little bit easier, as it has mine.
Does Oracle’s Block Change Tracking File Shrink?
Reposted from The Pythian Group blog.
Just a quick post to get myself back into blogging mode. Recently in IRC (#oracle on freenode, to be precise), a fresh face asked if the Block Change Tracking file ever shrinks. She had been worrying about the file in her instance continuing to grow. A number of us speculated (non-BAAG!) that perhaps taking an RMAN backup would somehow purge the file of what it was keeping track of, and then the magical Oracle fairies would promptly resize it for us. Needless to say, I was hesitant to take this theory forward with Alex Gorbachev aware of my home address.
After setting up Oracle 10.2.0.1 on a nice VirtualBox image
(more on that in another post) running CentOS 5, I began to do some reading. For some reason, actually reading the official tahiti docs was last on my list. A search of the 10gR2 docs quickly yielded this (from RMAN Incremental Backups):
4.4.4.4 Estimating Size of the Change Tracking File on Disk
The size of the change tracking file is proportional to the size of the database and the number of enabled threads of redo. The size is not related to the frequency of updates to the database. Typically, the space required for block change tracking is approximately 1/30,000 the size of the data blocks to be tracked. Note, however, the following two factors that may cause the file to be larger than this estimate suggests:
- To avoid overhead of allocating space as your database grows, the change tracking file size starts at 10MB, and new space is allocated in 10MB incremenents [sic]. Thus, for any database up to approximately 300GB the file size is no smaller than 10MB, for up to approximately 600GB the file size is no smaller than 20MB, and so on.
- For each datafile, a minimum of 320K of space is allocated in the change tracking file, regardless of the size of the file. Thus, if you have a large number of relatively small datafiles, the change tracking file is larger than for databases with a smaller number of larger datafiles containing the same data.
So (if the docs are to be trusted), it would seem that whether or not a backup is taken has no effect on the size of the file, or at least wouldn’t cause it to be shrunk. The size is tied to the amount of data in the database itself, not necessarily the changes in the database waiting to be included in the next incremental RMAN backup.
The documentation does suggest, however, that file size might be affected if (for example) a tablespace and its datafiles were dropped from the database. I’ll save this test for another day!
FLASHBACK TABLE vs. DBA_OBJECTS . LAST_DDL_TIME
NOTE: This post originally appeared on The Pythian Group blog on 6 June 2008, and is reposted here with permission.
A little over a week ago, a teammate and I were trying to use Oracle’s FLASHBACK TABLE to undo an “oops” UPDATE statement that a client’s developers had run on one of their test databases, clearing data from two columns in all rows of the table. The statement was actually part of a script that also contained ALTER TABLE statements to add columns. This is important to note because FLASHBACK TABLE will only let you go back as far as the most recent DDL against that table. To quote the SQL reference, “Oracle Database cannot restore a table to an earlier state across any DDL operations that change the structure of the table.”
die Updates
Just dropping a line to let everyone know that the large part of my tech blogging will now be at the Pythian Group blog. I’ll keep this blog space open and post links to my new posts over there. I’m still searching for the balance of what will be strictly over there vs. what content will be just here, so please bear with me.
Bind Variables and Parallel Queries Do Not Mix
This post was promised long ago, and I apologize for the tardiness. Some of you may recall my whining about seemingly unexplainable instance hanging since migrating our database to 64-bit hardware in September. Well, after some back and forth and hand-offs from one rep to another, we finally were given a possible explanation: Bug 4367986. The summary of the bug is “bind peeked parallel cursors do not share.” This basically means that parallel queries that use bind variables won’t share cursors. Not only does this defeat the purpose of using the bind variables, but it creates a new cursor for each parallel process. After a while, your cursor count will go up, just as mine did:
select sql_id, count(*) from v$sql_shared_cursor where bind_peeked_pq_mismatch='Y' group by sql_id; SQL_ID COUNT(*) ------------- ---------- f3u64ru922snx 520 ckha07wkfaf8v 5 9g26upcqjh8kp 1 gdnga6d26vf4g 15
While I wasn’t able to choke and hang the instance in development, I was able to drive the count up as we saw in production.
We probably didn’t see this before our x86_64 migration because on our 32-bit instance, our parallel_max_servers was only set to 16. After migrating to the new hardware it was raised to 80 based on “the formula”. Dropping it to 0 obviously prevented the problem from coming up as well.
The bug is reportedly fixed in 10.2.0.4, which wasn’t released at the time we were finishing up the SR. There was only a one-off patch for 10.2.0.3, meaning we had to upgrade from 10.2.0.2. Well we did this, applied the patch, and haven’t had a reoccurrence of the problem since. That query listed above now happily returns no rows.
Of course there was also the issue of using bind variables in queries against partitioned tables. Greg Rahn had this to say:
Using PQ with binds can have other adverse effects, specifically if
the partition key is not provided as a literal. When the partition
key is a bind, the resulting plan will be a KEY-KEY plan (for
pstart/pstop) because w/o a literal value the optimizer can not tell
if there is any partition elimination since the literal value is not
provided at parse time. This often times results in a “wost case”
assumption, thus is it possible to have different plans even when the
bind and literal statements use the same values.I would speculate that the overhead of parsing literals when using PQ
is minimal compared to the side effects it is causing (due to the bug)
and the potential of suboptimal plans. I personally would never mix
the two.
ALTER TABLE doesn’t like synonyms
Something new I found out today. When performing an ALTER TABLE statement (in this case to add a column) in another schema for which you have a synonym (public or private), you must refer to it with the schema qualifier, e.g. HR.EMPLOYEES rather than just EMPLOYEES. It seems the DDL statement doesn’t bother itself with synonyms. Probably a good thing, as DDL is nothing to be taken lightly and it’s best to fully spell out what it is you are mucking about with.
In the example, my user has been granted the ALTER ANY TABLE privilege, as well as SELECT on HR.EMPLOYEES. I’ve made a private synonym in my own schema, also named EMPLOYEES. Let’s see what happens:
SQL> alter table employees add foo number; alter table employees add foo number * ERROR at line 1: ORA-00942: table or view does not exist SQL> alter table hr.employees add foo number; Table altered.
Meet My DataGuard Sandbox
Say hello to my meager setup to learn and test DataGuard (maybe RAC someday too). BULGY and BULSTRODE, a couple of old Gateway E-series towers (each 2.80 Ghz, Pentium 4 CPU, 2 GB RAM), running CentOS 5 and (soon to have) Oracle 10.2.0.4.
Click for an equally unimpressive larger version.
BULGY is my old workstation-turned-sandbox for Oracle. I had been wanting to play with RAC, even though it’s probably overkill for us now. But we are looking to set up a DataGuard standby, so this would be a better fitting use of the hardware. I had been bookmarking some of the RAC-via-VMWare guides, but then I saw another black E-series sitting on the shelf in the MIS corner of my office space. I gave one of the techs my sad puppy-dog eyes and he hooked me up and even bumped up the RAM to match BULGY. I christened it BULSTRODE in keeping with our naming convention of Thomas the Tank Engine characters.
v$sql_bind_capture Not Quite Capturing SQL Binds
A week or so ago, I was working with a developer to find out why a SQL statement he was sending would sometimes work and sometimes return an invalid DATE format error. Part of the troubleshooting led us to examine the actual bind values being received by the server, just to make sure it was the same as being sent. I thought I was onto something when I saw that the fields of TIMESTAMP datatype actually were NULL:
SQL> select name, datatype_string, value_string
from v$sql_bind_capture
where sql_id='0wp5c2a3z82jr'
SQL> /
NAME DATATYPE_STRING VALUE_STRING
---- --------------- ------------
:1 TIMESTAMP {NULL}
:2 VARCHAR2(32) 3WC
:3 VARCHAR2(32) 001
:4 TIMESTAMP {NULL}
:5 NUMBER 8429721
:6 NUMBER 206
6 rows selected.
Unfortunately it was like this when the query both failed and succeeded. I banged my head against the wall for 30 more minutes before concluding that it must be a bug. Metalink agreed, and Note 444551.1 details the bug, which is not yet fixed and “should be fixed in 11.2,” affecting versions 9.2 to 11.1. Seems kind of long for something like this to still be out in the open, and it’s very annoying when you really want to see the value of a TIMESTAMP bind variable.
A Sub-Prime I/O Primer
My fans on oracle-l already know that I’ve had a bit of a battle with Oracle I/O recently, most of the damage being self-inflicted. I’d like to give as forensic a review as possible so that those poor souls who come after me will have some hope and inspiration to carry on.
First of all, some definitions, as they relate to Oracle:
Direct I/O: This is I/O done without the use of the OS (or filesystem) buffer cache. Oracle already has its own buffer cache, so the filesystem cache is (usually) makes I/O take much longer than just bypassing it and relying solely on Oracle to handle the data buffer cache, which it should do much better (for it’s own needs) than the filesystem.
Asynchronous I/O: This type of I/O means that I/O commands are sent but other processing can continue before the I/O has finished.
NOTE: These two kinds of I/O, direct and asynchronous, are completely independent of each other. For some reason that I can’t explain, I had it stuck in my Coke-addled mind that direct was the opposite of asynch I/O. As many times as I read over guides and glossaries, it wouldn’t shake loose.
And now, for our feature presentation.
Act I
In which our hero becomes bewildered with the possibilities
I had been idling in #oracle when hali mentioned something to someone about direct I/O. Having not dealt with I/O very much in my own travels, I inquired more. It was then that I learned of filesystemio_options and the benefits of setting them to “directio.”
Needless to say, I was sold.
I set about with my trusty system/storage administator on mounting our vxfs partitions with the proper parameters (mincache=direct,convosync=direct) and setting filesystemio_options=directio, which I’m led to believe is redundant. Once the filesystem is mounted with the directio options, Oracle (and anything else) will always use direct I/O.
Act II
In which direct meets asynchronous
Fast forward three or four weeks and I’m grasping at straws trying to figure out a problem that causes the production instance to hang. (I’m not going to address these hangs in this post, as I’m not really sure that they have to do with direct I/O.) There seemed to be a perfect storm of changes in the weeks prior to these problems happening
- Migrated to a 64-bit server (via datapump exp/imp)
- Much larger SGA (from 1.5 Gb to 16 Gb)
- Dramatically different datafile layout
- Using direct I/O
And that’s just to name the ones I can think of at this moment. I decided (after weeks of tinkering elsewhere) to look at direct I/O. It was during this second look that I realized that one probably should also have asynchronous I/O enabled.
Act III
Paradise Lost
And so I set filesystemio_options=setall. There were no problems upon instance restart. However this showed up in that evenings RMAN backup:
Starting Control File and SPFILE Autobackup at 29-NOV-07 RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: ===========================================================RMAN-03009: failure of Control File and SPFILE Autobackup command on ORA_DISK_1 channel at 11/29/2007 02:00:46 ORA-19502: write error on file "/rman/c-3171457975-20071129-00", blockno 321 (blocksize=16384) ORA-27061: waiting for async I/Os failed Linux-x86_64 Error: 14: Bad address Additional information: -1 Additional information: 1048576
oracle:~/sr $ strace -c -p 12605 Process 12605 attached - interrupt to quit Process 12605 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 80.66 1.142811 298 3831 io_getevents 7.26 0.102820 51 2004 io_submit
Jumping to Conclusions
As I mentioned before, I’m not convinced at all that the hangings had anything to do with the I/O issues. However I must note that since we enabled asynchronous I/O on top of the direct I/O, we haven’t had any instance hanging. Where we used to get them once every other day, it has been three weeks since the last incident.
To their end, Oracle support had analyzed our system state dumps and noted that our session cached cursors was at 100% utilization, and a lot of the wait events during the hanging had to do with cursors (cursor: pin S wait on X). Per their recommendation, we have raised the session_cached_cursors parameter, but haven’t had a window yet in which to bounce the instance so the parameter can take effect.
If there ever is a point where we figure out exactly what was causing the problem and what the solution was, I’ll be sure to write about it in this space. For now I just wanted to show you what an impulsive, ignorant fool I am.
~ Fin ~
The Rumble Down Under
Looks like Richard Foote won’t be on Howard Rogers’ Christmas card list for much longer (see the comments).



