V$LOCK视图显示TX锁没有对应的TM锁的几种可能现象

v$lock视图是Oracle中经典的锁检测接口之一,该视图列出了当前实例中被持有的锁信息;其中TYPE列指出了该持有锁的类型,比较常见的TX锁即事务队列锁(Transaction enqueue)和TM锁即DML enqueue锁(俗称表锁);一般某个会话在其事务(transaction)结束前总是持有一个TX锁,并一个或多个TM ROW-X(SX)锁 (一个事务中可能对多个表或对象进行了更新). 但偶尔我们会发现某个会话持有一个TX锁,却没有对应的TM锁的情况. 第一次接触到该问题,可能会觉得有些不可思议,但这种情况却的确存在:

SQL> select sid,type,id1,id2,lmode,request,ctime,block from v$lock where type  in ('TM','TX');

       SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------- -- ---------- ---------- ---------- ---------- ---------- ----------
      2090 TM     112504          0          3          0          0          2
      2090 TM      47529          0          3          0          0          2
      2090 TM      47530          0          3          0          0          2
      2090 TM      57497          0          3          0          0          2
      2090 TM     109826          0          3          0          0          2
      1137 TM      46920          0          3          0        891          2
      2090 TM      49919          0          3          0          0          2
      1501 TX    3538968     635593          6          0     273274          2
       586 TX    8650772     560279          6          0      88978          2
      1112 TX    7929874     781313          6          0       1291          2
       970 TX    4128815     688150          6          0     238769          2
      2175 TX    2555916     738283          6          0     273630          2
      2016 TX    1966099     673442          6          0       6164          2
      1933 TX    1769516     589203          6          0     792823          2
      2206 TX   13434902     679577          6          0     237904          2
       943 TX   16384041     282250          6          0      27004          2
      1207 TX    7077931     494826          6          0     587548          2
       374 TX   14286860     599261          6          0     279997          2
      2404 TX    4325408     713583          6          0          0          2
      2145 TX    9109533     592942          6          0      26868          2
      2296 TX    1835016     631599          6          0      78593          2
      1967 TX   14876690     484837          6          0     288239          2
      3276 TX    1572869     721894          6          0        115          2
      3232 TX    4915206     704681          6          0     332428          2
      1374 TX   14417924     546699          6          0        263          2
       635 TX    3473435     799864          6          0          4          2
      1179 TX   10289168     644315          6          0       8679          2
      3034 TX   14221320     556459          6          0      23417          2
      1403 TX    6553606     666507          6          0       7949          2
      2872 TX     131116     637111          6          0     112979          2
      1316 TX    8519697     598740          6          0          5          2
      1264 TX   11534348     794295          6          0       3062          2
      1793 TX    5505026     613635          6          0        114          2
      2689 TX   10878991     511962          6          0     792824          2
       743 TX     393235     630546          6          0        721          2
      2744 TX    4784159     615312          6          0      20424          2
      1959 TX    2424878     763163          6          0     201334          2
      1468 TX    2293801     918876          6          0          1          2
      1980 TX      65555     776770          6          0       6463          2
      2038 TX    9699339     600600          6          0          1          2
      1551 TX    3080213     560800          6          0     587549          2
      2835 TX    7733272     885787          6          0        173          2
      2246 TX   11665419     657752          6          0      17988          2
      1229 TX   11010083     528844          6          0     792824          2
      1476 TX   12320806     581063          6          0     792823          2
      3254 TX   10813472     551889          6          0       9643          2
      1794 TX    3014673     556138          6          0         21          2
      1682 TX    6291465     661166          6          0      10960          2
       998 TX    3670027     717078          6          0        114          2
      2989 TX    2031638     672709          6          0       1233          2
      2090 TX   13107209     436774          6          0          1          2
      3203 TX    5373978     731263          6          0       4982          2
      1676 TX   13172747     565610          6          0     273811          2
      2096 TX    1900577     665848          6          0       6464          2
      1563 TX    6357010     556097          6          0     587549          2
      1899 TX    2621459     746534          6          0     279634          2
      1722 TX    9764875     567516          6          0      14502          2
      2907 TX    8585236     553757          6          0     225941          2
      2386 TX    6225966     675153          6          0        855          2
      2238 TX   12189735     554114          6          0       6121          2
      1118 TX   15663139     468328          6          0     237037          2
      2280 TX    8716326     593341          6          0     792824          2
      2222 TX     524309     678972          6          0          1          2
      1230 TX   15794181     512612          6          0     792823          2
      2593 TX   16711683     230810          6          0     201661          2
      2261 TX    4849676     681767          6          0      18159          2
      1069 TX   12451855     465400          6          0     587549          2
      1203 TX    8388613     602148          6          0     587549          2
      2829 TX     589835     571720          6          0       5143          2
       988 TX    3997708     645741          6          0      28958          2
      1771 TX   13631511     521392          6          0          2          2
      2601 TX    1638423     605864          6          0          5          2
      2319 TX     655374     738909          6          0          3          2
      2963 TX    2097153     637674          6          0       7640          2
      3146 TX   11599895     699476          6          0       9561          2
       377 TX    9043977     638698          6          0     181366          2
      3166 TX    2490407     904504          6          0       6688          2
       555 TX    7012358     590789          6          0     112970          2
      3226 TX    5898270     519207          6          0     792823          2
      1103 TX    9240589     672736          6          0      18184          2
      3195 TX   10158108     504545          6          0       2576          2
      2715 TX   15925274     428945          6          0     201522          2
      2558 TX    1703943     643492          6          0       3047          2
      3021 TX   12386346     494943          6          0     587549          2
      2264 TX    1441795     692086          6          0        149          2
      2416 TX    6946836     655343          6          0     261655          2
      1397 TX   12517393     588260          6          0    1010345          2
      1439 TX   16318509     299824          6          0       5011          2
       869 TX    9437186     513497          6          0          2          2
      2535 TX   15597580     456805          6          0     182254          2
      2130 TX    4521998     674031          6          0     204313          2
      2708 TX    7405570     536059          6          0       5836          2
      2259 TX   15204353     467190          6          0     792823          2
       429 TX   10485792     738387          6          0     278846          2
      2031 TX    8912896     683349          6          0     273565          2
       559 TX   10092545     566706          6          0     587548          2
      3067 TX   13369376     673321          6          0          5          2
      2290 TX   15990804     472707          6          0     792823          2
      3208 TX    5439521     627538          6          0     156833          2
       563 TX    7798805     741212          6          0      18417          2
      2556 TX    5570567     483016          6          0     587549          2
      1443 TX   13828137     590212          6          0     273603          2
      1594 TX    9633796     613365          6          0     179342          2
       500 TX    2162700     615179          6          0       3505          2
      1458 TX    2752543     636530          6          0     587549          2
      2294 TX   14811182     545859          6          0          1          2
      2647 TX     262173     631006          6          0        376          2
      2770 TX    8847365     665512          6          0         18          2
      1881 TX    6815766     505031          6          0      17989          2
      1733 TX    2228250     744841          6          0      89732          2
      3253 TX   12124195     661906          6          0       4917          2
      1171 TX    4390934     688053          6          0        952          2
       575 TX    2359297     792394          6          0        932          2
      1952 TX    7995429     778698          6          0       7377          2
      1073 TX   10420261     676462          6          0         21          2
      2976 TX   14483498     557853          6          0     332223          2
       310 TX    8126473     611980          6          0     196341          2
      2624 TX   12779564     585073          6          0     177140          2
      1400 TX    8454154     548105          6          0          1          2
      2541 TX    3342373     612450          6          0     154890          2
       882 TX   11796503     502190          6          0       5987          2
      1475 TX     327695     668278          6          0          0          2
      1694 TX    7864342     655797          6          0     587549          2
      1882 TX    6422539     647676          6          0      16739          2
      3133 TX   13697064     591601          6          0          4          2
       637 TX   12845067     554594          6          0        976          2
       809 TX   11730960     737397          6          0         13          2
      2799 TX    3604491     762060          6          0          4          2
      1634 TX    8978449     564400          6          0          3          2
       601 TX   15532043     430716          6          0     792823          2
      2739 TX    9830436     585243          6          0       5955          2
      1110 TX   12713992     705911          6          0       5057          2
      1738 TX   16187423     344322          6          0     202512          2
      3040 TX    9502753     586506          6          0      12690          2
      1796 TX    7602221     697164          6          0     254457          2
      1149 TX    7143454     442359          6          0        999          2
      2940 TX   14352418     620297          6          0     204583          2

137 rows selected.

most tx lock ,no corresponding tm locks.
i have confirmed that there are no table lock disabled tables in database.

SQL> select   * from dba_tables where table_lock!='ENABLED';

no rows selected

以上演示其实已经列出了引起该种现象的一种可能性,即某些表上是”disable table lock”的,即该表上不会出现任何类型的TM锁,这也就意味着该表在”Enable table lock”之前不能进行任何DDL修改;关于”disable table lock”的论述可以参见Steve Adams的<oracle8i internal services一书。
我在这里再列出几种可能的原因,权作抛砖引玉:
1. 事务中包括远程DML操作,最有可能的一种情况。
2. Remote select statement through dblink can also generate tx lock.远程select也会在本地产生TX锁.
3. 事务中使用savepoint :

SQL> drop table tv;

表已删除。

SQL> create table tv (t1 int);

表已创建。

SQL> savepoint a;

保存点已创建。

SQL> insert into tv values(1);

已创建 1 行。

SQL> rollback to a;

回退已完成。
SQL> select sid,type,lmode from v$lock where sid=(select distinct sid from v$my
tat) and type in ('TM','TX');

       SID TY      LMODE
---------- -- ----------
       125 TX          6
SQL> commit;

提交完成。
SQL> select sid,type,lmode from v$lock where sid=(select distinct sid from v$mys
tat) and type in ('TM','TX');

未选定行

4.使用dbms_transaction包的相关函数,如:

SQL> select sid,type,lmode from v$lock where sid =(select distinct sid from v$mystat) and type in ('TX','TM');
no rows selected
SQL> exec dbms_output.put_line( dbms_transaction.local_transaction_id(true) );

PL/SQL procedure successfully completed.

SQL> select sid,type,lmode from v$lock where sid =(select distinct sid from v$mystat) and type in ('TX','TM');

       SID TY      LMODE
---------- -- ----------
       159 TX          6

5. 事务中所涉及到的表启用了”disable table lock”,如:

SQL> desc apps.tv;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 T1                                                 NUMBER(38)

SQL> alter table apps.tv disable table lock;

Table altered.

SQL> insert into apps.tv values(2);

1 row created.

SQL> select sid,type,lmode from v$lock where sid =(select distinct sid from v$mystat) and type in ('TX','TM');

       SID TY      LMODE
---------- -- ----------
       159 TX          6

亦或者DML_LOCKS设置为0(DML_LOCKS参数设置为0,在9i以上版本会引起诸多问题,一般仅在测试或研究时修改该参数).

以下列出Metalink上,关于介绍如何troubleshooting 分布式事务的文档:

Script to report information about distributed transactions.

Product Name, Product Version Oracle Server, 7.3.x to 11.1
Platform Platform Independent
Date Created 12-OCT-1999

Instructions

Execution Environment:
SQL, SQL*Plus

Access Privileges:
Must be run by SYS , or a user with privileges to
select from x$k2gte , x$ktcxb , x$ksuse , and v$session_wait .

Usage:
sqlplus /nolog
SQL> connect sys/
SQL> @distri
SQL> @distri_details

Instructions:

Copy the first query into a file named distri.sql. Execute the script from sqlplus
connected to sys. If further information needs to be collected then copy the second
query to a file named distri_details.sql and run it while connected as sys.

PROOFREAD THIS SCRIPT BEFORE USING IT! Due to differences in the way text
editors, e-mail packages, and operating systems handle text formatting (spaces,
tabs, and carriage returns), this script may not be in an executable state
when you first receive it. Check over the script to ensure that errors of
this type are corrected.

Description

This script includes two queries to show information about active distributed
transactions. The structure of x$k2gte is slightly modified with Oracle8, so
Oracle7 users should use the commented lines of the queries.

o First Query : distri.sql
Is intended to show , for each active distributed transaction :

GTXID = Global Transaction ID
LSESSION= Local Session , this is the session in your local database
(ie: the database where you are connected and running this script)
S = Status of the LSESSION, it has 1 letter with this meaning :

First
—————–
I (inactive)
A (active)
C (cached)
S (sniped)
K (killed)

WAITING = what LSESSION is waiting for

ORIGIN = Machine-PID , Machine where the Client Tool is running,
PID is the Operating System Process Id of Client Process

You would want to run this SELECT to have a quick look at the active
distributed transactions.

o Second Query : distri_details.sql
Is intended to show more details about the active distributed transactions.
See the “Sample Output” below to get details.

You would want to run this SELECT when you need to see all the information
about a particular distributed transaction .

References

Note:62354.1 TX Transaction locks – Example wait scenarios

Script

REM distri.sql
column origin format a13
column GTXID format a35
column LSESSION format a10
column s format a1
column waiting format a15
Select /*+ ORDERED */
substr(s.ksusemnm,1,10)||’-‘|| substr(s.ksusepid,1,10) “ORIGIN”,
substr(g.K2GTITID_ORA,1,35) “GTXID”,
substr(s.indx,1,4)||’.’|| substr(s.ksuseser,1,5) “LSESSION” ,
substr(decode(bitand(ksuseidl,11),
1,’ACTIVE’,
0, decode(bitand(ksuseflg,4096),0,’INACTIVE’,’CACHED’),
2,’SNIPED’,
3,’SNIPED’, ‘KILLED’),1,1) “S”,
substr(event,1,10) “WAITING”
from x$k2gte g, x$ktcxb t, x$ksuse s, v$session_wait w
— where g.K2GTeXCB =t.ktcxbxba <= use this if running in Oracle7
where g.K2GTDXCB =t.ktcxbxba — comment out if running in Oracle8 or later
and g.K2GTDSES=t.ktcxbses
and s.addr=g.K2GTDSES
and w.sid=s.indx;

REM distri_details.sql
set headin off
select /*+ ORDERED */
‘—————————————-‘||’
Curent Time : ‘|| substr(to_char(sysdate,’dd-Mon-YYYY HH24.MI.SS’),1,22) ||’
‘||’GTXID=’||substr(g.K2GTITID_EXT,1,10) ||’
‘||’Ascii GTXID=’||g.K2GTITID_ORA ||’
‘||’Branch= ‘||g.K2GTIBID ||’
Client Process ID is ‘|| substr(s.ksusepid,1,10)||’
running in machine : ‘||substr(s.ksusemnm,1,80)||’
Local TX Id =’||substr(t.KXIDUSN||’.’||t.kXIDSLT||’.’||t.kXIDSQN,1,10) ||’
Local Session SID.SERIAL =’||substr(s.indx,1,4)||’.’|| s.ksuseser ||’
is : ‘||decode(bitand(ksuseidl,11),1,’ACTIVE’,0,
decode(bitand(ksuseflg,4096),0,’INACTIVE’,’CACHED’),
2,’SNIPED’,3,’SNIPED’, ‘KILLED’) ||
‘ and ‘|| substr(STATE,1,9)||
‘ since ‘|| to_char(SECONDS_IN_WAIT,’9999′)||’ seconds’ ||’
Wait Event is :’||’
‘|| substr(event,1,30)||’ ‘||p1text||’=’||p1
||’,’||p2text||’=’||p2
||’,’||p3text||’=’||p3 ||’
Waited ‘||to_char(SEQ#,’99999′)||’ times ‘||’
Server for this session:’ ||decode(s.ksspatyp,1,’Dedicated Server’,
2,’Shared Server’,3,
‘PSE’,’None’) “Server”
from x$k2gte g, x$ktcxb t, x$ksuse s, v$session_wait w
— where g.K2GTeXCB =t.ktcxbxba <= use this if running Oracle7
where g.K2GTDXCB =t.ktcxbxba — comment out if running Oracle8 or later
and g.K2GTDSES=t.ktcxbses
and s.addr=g.K2GTDSES
and w.sid=s.indx;
set headin on
— end script

Sample Output

For all the samples, we have 2 machines :

esibm1 with a database V73_0.WORLD
and
esosf2 with a database V734.WORLD

In esibm1 we have a public database link V734 using ‘V734.WORLD’;
and in esosf2 we have a public database link V73_0 using ‘V73_0.WORLD’;

Sample-1 :
~~~~~~~~~~

-in esibm1: sqlplus scott/tiger

select * from dual@V734 for update ;

-in esibm1: sqlplus sys/manager @distri

ORIGIN

GTXID

LSESSION

S

WAITING

esibm1-23658

V73_0.WORLD.ea8e383e.2.45.4446

9.267

I

SQL*Net-m

ORIGIN= esibm1-23658 , this means the Client – sqlplus in this case – is
running in machine esibm1 and it is the process with PID= 23658

GTXID = V73_0.WORLD.ea8e383e.2.45.4446 is the Global Transaction Identifier,
it identifies the database where the distributed transaction started,
it is the database where the Client is directly connected.
V73_0 = db_name
WORLD = db_domain
ea8e383e = reverse Db Id ( DataBase ID, it is calculated when you create
^^^^^^^^ the database, you can see this DB Id in a dump of Control
File, for example, for V73_0 it is :

DUMP OF CONTROL FILES, Seq # 7205 = 0x1c25
FILE HEADER:
Software vsn=120586240=0x7300000, Compatibility Vsn=120586240=0x7300000
Db Id=3817072814=0xe383e8ae, Db Name=’V73_0′
^^^^^^^^^^
2.45.4446 = Local Transaction ID in esibm1 ,
This session ( LSESSION) will keep an eXclusive lock on
a local enqueue “TX-2.45-4446”
See Note-62354.1

LSESSION = 9.267 , this is the V$SESSION.ID and V$SESSION.SERIAL# of the
session you are using in V73_0 database in esibm1.
( you can see all session details in esibm1 with :
select * from v$session where sid=9 and serial#=267 ; )

S = I , this is the State, it is I for Inactive .

WAITING , this is what this LSESSION is waiting for, in this case it is
waiting for ‘SQL*Net message from client’ , it is the standard wait
when we are waiting for Client to give us work to do .

-in esosf2 : sqlplus sys/manager @distri

ORIGIN

GTID

LSESSION

S

WAITING

esibm1-23658

V73_0.WORLD.ea8e383e.2.45.4446

11.1013

I

SQL*Net m

LSESSION = 11.1013 , this is the V$SESSION.ID and V$SESSION.SERIAL# of the
session you are using in V734 database in esosf2.
( you can see all session details with
select * from v$session where sid=11 and serial#=1013 ; )

Sample-2
~~~~~~~~~
-in esosf2 : sqlplus scott/tiger
select * from dept for update ;

-in esibm1 : sqlplus scott/tiger
select * from dept@V734 for update
.
In this case, the distributed transaction will have to wait because it tries
to lock dept’s rows which are locked by other transaction. We use the SELECT
distri_details.sql to show what happens :

– in esibm1 : sqlplus sys/manager@distri_details

This shows the session in the database where the Client is directly connected,
it is using a dblink, and it is waiting a message from dblink.

—————————————-
Curent Time : 06-Jun-2000 12.50.34
GTXID=5637335F30
Ascii GTXID=V73_0.WORLD.ea8e383e.2.21.4446
Branch= 0000
Client Process ID is 23658
running in machine : esibm1
Local TX Id =2.21.4446
Local Session SID.SERIAL =9.267
is : ACTIVE and WAITING since 0 seconds
Wait Event is :
SQL*Net message from dblink driver id=1446458164,#bytes=1,=0
Waited 207 times

Server for this session:Dedicated Server

– in esosf2 : sqlplus sys/manager @distri_details

This shows the session in the remote database, it is waiting to lock
dept’s rows ( waiting for an TX enqueue , see Note:62354.1 )

—————————————-
Curent Time : 06-Jun-2000 13.09.38
GTXID=5637335F30
Ascii GTXID=V73_0.WORLD.ea8e383e.2.21.4446
Branch= 020015005E1100000104
Client Process ID is 23658
running in machine : esibm1
Local TX Id =0.0.0
Local Session SID.SERIAL =11.1013
is : ACTIVE and WAITING since 6 seconds
Wait Event is :
enqueue name|mode=1415053318,id1=262158,id2=2611
Waited 122 times

Server for this session:Shared Server


Distributed Database, Transactions and Two Phase Commit [ID 13229.1]

What is the Distributed Database Option?
—————————————-
A distributed system is one in which both data and transaction processing
are divided between one or more computers connected by a network, each
computer playing a specific role in the system.

This configuration has multiple databases, each of which is accessed
directly by a single server and can be accessed indirectly by other
instances through server/server cooperation. Distributed systems
allow you to have data physically located at several sites, and
each site can transparently access all of the data.

Each node can be used for database processing, but the data is
permanently partitioned among the nodes. Several smaller server
machines can be cheaper and more flexible than one large,
centrally located server.

The key goals of a distributed database system are availability,
accuracy, concurrency, and recoverability.

The Client-Server Model and Distributed Systems
———————————————–
The client-server model is basic to distributed systems. It is a
response to the limitations presented by the traditional mainframe
client-host model, in which a single mainframe provides shared data
access to many dumb terminals. The client-server model is also a
response to the local area network (LAN) model, in which many isolated
systems access a file server that provides no processing power.

Client-server architecture provides integration of data and services
and allows clients to be isolated from inherent complexities, such as
communication protocols. The simplicity of the client-server architecture
allows clients to make requests that are routed to the appropriate server.
These requests are made in the form of transactions. Client transactions
are often SQL or PL/SQL procedures and functions that access individual
databases and services.

Distributed Database Characteristics
————————————

This article describes the twelve specifications for the ideal distributed
database management system and how ORACLE conforms to these specifications.

Oracle’s distributed architecture, comprising SQL*Net, Open Gateway and the
Oracle Server, provides an effective solution to the challenge of sharing
data in a networked environment.

The Oracle Server’s distributed architecture provides effective data sharing
in a networked environment using both client-server and distributed database
architectures.

In a client-server environment, communication takes place between two
processes that reside on different machines. The client executes the
application or application tool and sends requests to the server for data.
The received data is processed at the client machine. This is known as
distributed processing. The ideal distributed system should look like a
non-distributed system. Twelve specifications for the ideal distributed
database were developed by C.J. Date.

The Oracle Server supports most of the ideal distributed features.

1. Site Autonomy

Site autonomy means that each server participating in a distributed database is
administered independently from all other databases.
The data is owned and managed locally. Local operations remain purely local.
One site (node) in the distributed system does not depend on another site to
function successfully.

2. No reliance on a central site

All sites are treated as equals. Each site has its own data dictionary.

3. Continuous Operation

Incorporating a new site has no effect on existing applications and does not
disrupt service.

4. Location Independence

Users can retrieve and update data independent of the site.

5. Fragmentation Independence

Users can store parts of a table at different locations. Both horizontal and
vertical partitioning of data is possible.

6. Replication Independence

Stored copies of data can be located at multiple sites. Read-only snapshots
and updatable snapshots provide read-only and updatable copies of tables,
respectively. Symmetric Replication using triggers make readable and
writable replication possible.

7. Distributed Query Processing

Users can query a database residing on another node. The query is executed
at the node where the data is located.

8. Distributed Transaction Management

A transaction can update, insert or delete data from multiple databases.
The two-phase commit mechanism in Oracle ensures the integrity of distributed
transactions. Row level locking ensures a high level of data concurrency.

9. Hardware Independence

Oracle runs on all major hardware platforms.

10. Operating System Independence

A specific operating system is not required. Oracle runs under a variety
of operating systems.

11. Network Independence

The Oracle Server’s SQL*Net supports most popular networking software.
Network independence allows communication across homogeneous and heterogenous
networks. Oracle’s MultiProtocol Interchange enables applications to
communicate with databases across multiple network protocols.

12. DBMS Independence

DBMS Independence is the ability to integrate different databases. Open
Gateway supports connections to non-Oracle databases.

13. Distributed Database Security
The database supports all of the security features that are available with a
non-distributed database environment for distributed database systems, including:

Password authentication for users and roles
Some types of external authentication for users and roles including:
Kerberos version 5 for connected user links
DCE for connected user links

DISTRIBURTED TRANSACTIONS AND THE TWO PHASE COMMIT
==================================================

Two phase commit only comes in play during a commit of a distributed
transaction. The whole purpose is to maintain the integrity of the
“global” database. In other words, two phase commit guarantees that
everything will either commit or rollback.

TRANSACTION TYPES:

LOCAL TRANSACTION contains ONLY statements on the local node.

REMOTE TRANSACTION contains one or more statements which ALL
reference the same remote node.

DISTRIBUTED
TRANSACTION contains statements that modify data in
two or more distinct nodes.
* The only place where 2-PHASE COMMIT comes
into play.

TERMS:

CLIENTS are nodes that references information from another database
(C) server in a distributed transaction.

SERVERS are nodes that are directly referenced in a distributed
(S) transaction, or is requested to participate in a transaction
because another node requires data from it.

GLOBAL
COORDINATOR is the node in which the distributed transaction
(GC) originates.

LOCAL
COORDINATOR is the node that references data on other nodes to
(LC) complete its part in the distributed transaction.

COMMIT POINT
SITE is the site with the highest commit point strength
(CPS) “init.ora” parameter. It is usually the most critical
site that can not afford collisions in case of an
in-doubt transaction.

SCN is the system commit number.

SCN is essentially an internal database clock.
This is a monotonically increasing an unique number
for each transaction.

COMMIT_POINT_STRENGTH is the init.ora parameter that determines the
(CPstr) COMMIT POINT SITE.

When you attempt to commit a distributed transaction, you will enter
ORACLE’S TWO PHASE COMMIT MECHANISM.

TWO-PHASE COMMIT:

PREPARE PHASE:

1) Commit point site is determined.

2) Global coordinator asks all participating nodes
(except commit point site) to promise to COMMIT or
ROLLBACK the transaction regardless of failure).

This information is propagated by the local
coordinators. The servers have to be prepared before
the local or global coordinators (except the commit
point site). The local coordinator is responsible
for asking dependant nodes to prepare.

POSSIBLE RESPONSES FROM NODES:
* PREPARED
* ABORT
* READ-ONLY NODES

Locks obtained during the distributed transaction
will continue to be held.

Redo is flushed to the local redo logs.

3) Each node will pass back the SCN for his node.

4) Global Coordinator determines the max SCN.

After all the nodes have prepared successfully, we enter the commit phase.
All transactions except those found in the commit point site are “in-doubt”
until the commit point phase completes successfully.

COMMIT PHASE:

1) Global coordinator sends the max SCN to the commit point
site and asks it to commit.

2) Commit Point Site will try to commit or everything is
rolled back.

The locks are released in the commit point site first.

3) If committed/rolled back, the Commit Point Site will
inform the Global Coordinator which will commit/rolled back
at that time.

4) The information will propogate down to its clients/local
coordinators and they will commit/roll back and propagate
the information down to their servers until there are no
more servers.

(note: READ ONLY nodes do not participate in 2-phase commit.)

EXAMPLE:

UPDATE EMP SET SAL= SAL * 1.10 WHERE DEPTNO=10;
UPDATE EMP@HAWAII SET SAL = SAL*1.20 WHERE DEPNO=20;
BEFORE UPDATE TRIGGER Fires and updates
dept@newyork.
another trigger fires and inserts
dept_audit@paranoid.
after update trigger fires and inserts into the global
audit table, emp_audit@headquarters
INSERT INTO VACATION_TIME SELECT * FROM DAYS_OFF@HOME
COMMIT; (TWO-PHASE COMMIT BEGINS HERE)

_______________
| |
select days_off | HOME | SERVER /*READ-ONLY*/
|_____________| SCN=20000
| CPstr=250
|
_______________
| | GC/LC/CLIENT
update emp | LOCAL | SCN=100
|_____________| CPstr=125

/ \
/ \
____________ _____________
update emp | | | | insert emp_audit
SERVER/LC/CLIENT | HAWAII | | HQ | SERVER
SCN=250 |___________| |____________| SCN =1000
CPstr=95 / CPstr=10
/
/
————
| | inserts dept_audit
| PARANOID | SERVER
|__________| SCN=50
CPstr=205

PREPARE PHASE:
1) COMMIT POINT SITE IS PARANOID
The global coordinator will already know what the
commit point strength of each node prior to the commit.
Read-only nodes are not included.
2) All nodes except for PARANOID is asked to prepare.
3) HAWAII, the local coordinator, is responsible to ask her
dependent nodes to prepare before she prepares. In this
case, PARANOID is a commit point site; thus, it is ignored.
4) The highest SCN is sent to LOCAL node via the
local coordinators. The highest SCN is 1000.
5) All nodes which PREPARED will flush entries of the
transaction to the redo logs if not already done.

If any of the nodes send an “ABORT” message back, then the transaction is
rolled back at this time. Any failure after the PREPARE phase will result
with “in-doubt” transactions.

COMMIT PHASE:
1) PARANOID IS ASKED TO COMMIT OR ROLLBACK BY THE LOCAL (GC).
2) PARANOID commits at a SCN greater than 1000.
a) Redo is flushed.
b) Locks are released.
c) outcome is relayed back to the LOCAL node (GC).
Assume success:
3) AFter receiving the information, GC will commit at the
same SCN and pass the information to its dependents.
a) commit flushed to redo logs.
b) data locks are released.
c) GC will pass the information to HAWAII and HQ.
(1) They, in turn, will commit and HAWAII
will pass the information to PARANOID.

If all is successful, every statement will commit with the same SCN and then
RECO will delete the entries from “dba_2pc_pending” and “dba_2pc_neighbors”
tables. Afterwards, the nodes will “forget” the transaction.

Forget Phase
1) After the participating nodes notify the commit point site
that they have committed, the commit point site can forget
about the transaction. The following steps occur:

2) After receiving notice from the global coordinator that all
nodes have committed, the commit point site erases status
information about this transaction.

3)The commit point site informs the global coordinator that
it has erased the status information.

4)The global coordinator erases its own information about the
transaction.

TWO-PHASE COMMIT QUIZ/ANSWERS
=============================

1) What is the difference between remote transaction and distributed
transaction?

REMOTE TRANSACTION contains one or more statements which ALL
reference the same remote node.

DISTRIBUTED TRANSACTION contains statements that modify data in
two or more distinct nodes.
* The only place where 2-PHASE COMMIT comes into play.

2) When does two-phase commit come in play?

Two phase commit only comes in play during a commit of a distributed
transaction. The whole purpose is to maintain the integrity of the
“global” database. In other words, two phase commit guarantees that
everything will either commit or rollback.

3) Define distributed database.

An environment that includes multiple servers where users manipulate
data as if there is one “global database”. With the widening use of
heterogeneous hardware environments and multiple database servers, it has
become essential to maintain the integrity of this “global database”. Of
course, the distributed concept is usually transparent to the end user and
the application programmers.

Another defintion could be:
A distributed database is a database that is not stored in its entirety at
a single physical location. Rather, a distributed database is a database that
is stored across a network of locations that are connected via communication
links. A distributed database consists of a collection of sites or nodes
connected together into a communication network.

4) When is a transaction considered “in-doubt”?

After all the nodes have prepared successfully, we enter the commit phase.
All transactions except those found in the commit point site are “in-doubt”
until the commit point phase completes successfully.

5) Define the following terms: CLIENTS, DATABASE SERVERS, GLOBAL COORDINATOR,
LOCAL COORDINATOR, COMMIT POINT SITE

CLIENTS are nodes that references information from another database
(C) server in a distributed transaction.

SERVERS are nodes that are directly referenced in a distributed
(S) transaction, or is requested to participate in a transaction
because another node requires data from it.

GLOBAL COORDINATOR is the node in which the distributed transaction
(GC) originates.

LOCAL COORDINATOR is the node that references data on other nodes to
(LC) complete its part in the distributed transaction.

COMMIT POINT SITE is the site with the highest commit point strength
(CPS) init.ora parameter. It is usually the most critical
site that can not afford collisions in case of an
in-doubt transaction.

SCN is the system commit number.
SCN is essentially an internal database clock.
This is a monotonically increasing an unique number for
each transaction.

COMMIT_POINT_STRENGTH is the init.ora parameter that determines the
(CPstr) COMMIT POINT SITE.

6) Can the global coordinator be the commit point site as well?

Yes, the only prerequisites are that the node is part of the distributed
transaction (not read-only) and it has the highest commit point strength
parameter.

7) What distinguishes the commit point site?

COMMIT POINT SITE is the site with the highest commit point strength
(CPS) “init.ora” parameter. It is usually the most critical
site that can not afford collisions in case of an
in-doubt transaction.

8) What does RECO do?

RECO PROCESS
* gets information from the “DBA_2PC_PENDING” and
“DBA_2PC_NEIGHBORS” views.
* executes as the global or local coordinator
* tries to communicate wit other nodes in the in-doubt transaction
If it cannot connect, it will continue to try at exponential
intervals.
* resolves in-doubt transactions if the connection is up.
* uses the same DBLINK as the transaction
* removes resolved transactions from the pending transaction table.

9) What happens during the PREPARE phase? Please be as detailed as possible?

PREPARE PHASE:

1) Commit point site is determined.

2) Global coordinator asks all participating nodes
(except commit point site) to promise to COMMIT or
ROLLBACK the transaction regardless of failure).

This information is propagated by the local
coordinators. The servers have to be prepared before
the local or global coordinators (except the commit
point site). The local coordinator is responsible
for asking dependant nodes to prepare.

POSSIBLE RESPONSES FROM NODES:
* PREPARED
* ABORT
* READ-ONLY NODES

Locks obtained during the distributed transaction
will continue to be held.

Redo is flushed to the local redo logs.

3) Each node will pass back the SCN for his node.

4) Global Coordinator determines the max SCN.

After all the nodes have prepared successfully, we enter the commit phase.
All transactions except those found in the commit point site are “in-doubt”
until the commit point phase completes successfully.

10) What happens during the COMMIT phase? Please be as detailed as possible?

COMMIT PHASE:

1) Global coordinator sends the max SCN to the commit point
site and asks it to commit.

2) Commit Point Site will try to commit or everything is
rolled back.

The locks are released in the commit point site first.

3) If committed/rolled back, the Commit Point Site will
inform the Global Coordinator which will commit/rolled back
at that time.

4) The information will propogate down to its clients/local
coordinators and they will commit/roll back and propagate
the information down to their servers until there are no
more servers.

(NOTE: READ ONLY nodes do not participate in 2-phase commit.)

11) What is SCN and how is it significant in two-phase commit?

SCN is the system commit number.
SCN is essentially an internal database clock.
This is a monotonically increasing an unique number for
each transaction.

The entire distributed transaction will be committed at the same SCN.
This will allow some type of consistency during distributed recovery.

12) Which two views are helpful during MANUAL distributed recovery?

They are “dba_2pc_pending” and “dba_2pc_neighbors”.

13) When does RECO remove the entries from those views?

If all is successful, every statement will commit with the same SCN and then
RECO will delete the entries from “dba_2pc_pending” and “dba_2pc_neighbors”
tables. Afterwards, the nodes will “forget” the transaction.

14) List some cases where you may do MANUAL RECOVERY?

A dba may do some manual recovery if there was a network or system or
database crash and the “in-doubt” transactions are holding locks that
are crucial to other applications.

15) How should you pick your COMMIT POINT SITE?

COMMIT POINT SITE is the site with the highest commit point strength
(CPS) “init.ora” parameter. It is usually the most critical
site that can not afford collisions in case of a doubted transaction.
You need to coordinate with the other dba adminstrators to make that
decision.

Use this command to wake RECO up after that:

How to Troubleshoot Distributed Transactions [ID 100664.1]

Applies to:
Oracle Server – Enterprise Edition – Version: 8.1.5.0 to 10.2.0.1
Information in this document applies to any platform.
Checked for relevance on 13-June-2008 by Faye Todd.
Appearance updated, no changes to technical content 19-Feb-2007
Purpose
You need to resolve a distributed transaction error because of errors received.
But before you try to implement any other steps, please diagnose whether it is
or it is not a REAL distributed transaction.
Last Review Date
June 13, 2008
Instructions for the Reader
A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.
Troubleshooting Details
Steps to Troubleshoot Distributed Transactions

1. Check the alert.log for errors, like:

ORA-1591 “lock held by in-doubt distributed transaction %s”
ORA-2062 “distributed recovery received dbid x, expected y”
ORA-2068 “following severe error from %s%s”

2. Make sure that the network is up and all db_links are valid and operational.
V$DBLINK AND GV$DBLINK show the db_links currently used by distributed
transactions:

SVRMGR> desc v$dblink

Column Name Null? Type
—————————— ——– —-
DB_LINK VARCHAR2(128)
OWNER_ID NUMBER
LOGGED_ON VARCHAR2(3)
HETEROGENEOUS VARCHAR2(3)
PROTOCOL VARCHAR2(6)
OPEN_CURSORS NUMBER
IN_TRANSACTION VARCHAR2(3)
UPDATE_SENT VARCHAR2(3)
COMMIT_POINT_STRENGTH NUMBER

SVRMGR> desc gv$dblink

Column Name Null? Type
—————————— ——– —-
INST_ID NUMBER
DB_LINK VARCHAR2(128)
OWNER_ID NUMBER
LOGGED_ON VARCHAR2(3)
HETEROGENEOUS VARCHAR2(3)
PROTOCOL VARCHAR2(6)
OPEN_CURSORS NUMBER
IN_TRANSACTION VARCHAR2(3)
UPDATE_SENT VARCHAR2(3)
COMMIT_POINT_STRENGTH NUMBER

3. Run the following query against DBA_2PC_PENDING:

SELECT LOCAL_TRAN_ID, GLOBAL_TRAN_ID, STATE, MIXED, HOST, COMMIT#
FROM DBA_2PC_PENDING
WHERE LOCAL_TRAN_ID = ‘??.’; [Insert code here

LOCAL_TRAN_ID is the Local transactionID (number that is given when error
is reported). If LOCAL_TRAN_ID = GLOBAL_TRAN_ID, it means that this site
is the Global Coordinator, i.e. where the transaction originated from.

You can also get LOCAL_TRAN_ID from the alert.log.

4. Run the following query against DBA_2PC_NEIGHBORS view on all Nodes:

SELECT LOCAL_TRAN_ID, IN_OUT, DATABASE, INTERFACE
FROM DBA_2PC_NEIGHBORS;

5. Get the values for COMMIT_POINT_STRENGTH init.ora parameter on all Nodes.

If you are not aware of what other Nodes are participating in the distributed transaction, query the DBA_2PC_NEIGHBORS view. The Node with COMMIT_POINT_STRENGTH that has highest value in init.ora has to commit first.

6. Check the STATE column from DBA_2PC_PENDING.

If STATE column states COMMIT, it means the local database has successfully committed.

This also means you most likely do not have to force anything on that Node.
Or if other Nodes have not committed and this particular Node happens to be
a Global Coordinator – you need to note COMMIT# i.e. SCN number from
DBA_2PC_PENDING view, as this is THE SCN# you will need to use if you need
to perform COMMIT FORCE.

7. Take the GLOBAL_TRAN_ID and COMMIT# (SCN number from DBA_2PC_PENDING) and
compare it to the value on the other nodes.

On Nodes where there are no entries for this COMMIT# and transactionID in DBA_2PC_PENDING, RECO process has resolved the problem and we don’t have to do anything.

Note: RECO is always automatically started whenever Distributed Transactions are enabled.

If you issue a UNIX command : ps – ef | grep -i reco and see RECO running, distributed transactions are correctly enabled.

If you combine this output with values of the init.ora parameter DISTRIBUTED_TRANSACTONS (which should be higher than 0 for distributed environment), you may safely say that the customer is running in a distributed environment even if this particular customer has not set it up.

8. If the STATE (from DBA_2PC_PENDING view) is PREPARED, then this Node has not completed the transaction.

Take GLOBAL_TRAN_ID (from DBA_2PC_PENDING) and COMMIT# (SCN number) and compare it to the value on the other nodes.

You need to also check DBA_2PC_NEIGHBORS and see whether there are any other
Nodes that reference the same transactionID in their DBA_2PC_PENDING. If you
see it referenced, it means there are children and they need to be resolved
(either committed or rollback) as well. Please take ‘children’ i.e. Nodes
that other Nodes (not Global Coordinator) into account.

Note: for clarification on terminology please see Note 13229.1

9. If there are no children, it is safe to manually interfere and either commit force or rollback force.

For global integrity, use the same COMMIT# (SCN) when you force the transaction. To force the transaction please see SYNTAX NOTES below.

10. After the transaction is forced, make sure that both DBA_2PC_PENDING and DBA_2PC_NEIGHBORS data dictionary views are empty.

If for some reason it is necessary to clean those views, use packaged procedure DBMS_TRANSACTION.purge_lost_db_entry.

Note 1012842.102 ORA-2019 ORA-2058 ORA-2068 ORA-2050:
FAILED DISTRIBUTED TRANSACTIONS
_$#$_

Note: Be very careful with this procedure as it is always more reliable and more consistent to let RECO resolve distributed transaction conflicts. Please attempt to use DBMS_TRANSACTION.purge_lost_db_entry only as a
last resort after every other measure was implemented including restarting the database to resolve the locking conflict.

11. If even then there are entries in the views and errors still appear, please check for Bug 684157. This is fixed in 8.0.6 and higher.
Syntax Notes

COMMIT FORCE command

Example of syntax where highest committed SCN is 88123887 (from dba_2pc_pending) and local transaction ID is 1.13.5197 (from either dba_2pc_pending or alert.log):

SVRMGR> COMMIT FORCE ‘your local transactionID on this node’, ‘highest SCN from already committed site’;

SVRMGR> COMMIT FORCE ‘1.13.5197’, ‘88123887’;

ROLLBACK FORCE command

Example of syntax where local transactionID is 1.13.5197 (from either dba_2pc_pending or alert.log):

SVRMGR> ROLLBACK FORCE ‘your local transactionID on this node’;
SVRMGR> ROLLBACK FORCE ‘1.13.5197’;

PURGING VIEWS:

*
Example of syntax for lost local transactionID 1.13.5197 (from either dba_2pc_pending or alert.log).
NOTE: Run as SYS where 1.13.5197 is the local_tran_id

SVRMGR> Execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY (‘1.13.5197’);
COMMIT;

*
Example of syntax for mixed local transaction ID 1.13.5197 (from either dba_2pc_pending or alert.log). Note, Run as SYS:

SVRMGR> Execute DBMS_TRANSACTION.PURGE_MIXED (‘1.13.5197’);
COMMIT;

Run PURGE_MIXED Procedure only if significant reconfiguration has occurred so that automatic recover (RECO) cannot resolve the transaction.

Example of appropriate cases when it can be used:

– Total loss of the remote database, or reconfiguration of software resulting in loss of two-phase commit capability.

NOTE: A ‘commit;’ should be issued after each DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY
and DBMS_TRANSACTION.PURGE_MIXED otherwise errors are given.

SCN Recovery Steps

Complete Recovery:

1. At the down site recover completely if possible (treat as regular recovery).

2. SCN will appear in the alert.log after recover is done on the crashed Node.

Incomplete Recovery:

1. If time-based or cancel-based recovery was used on the crashed node, other sites must be placed back to the same point in time for global consistency. Get last SCN from the alert.log of the crashed node.

2. At each node perform a shutdown normal or immediate.

3. Take a cold backup.

4. Restore the control file if necessary.

5. Restore the last backup of all datafiles along with archived redo logs.

6. Choose which tool to use to perform SCN recovery – either Server Manager line mode or RMAN.

7. Connect internal, startup mount, check status from v$datafile to make sure all datafiles are online. Issue ‘Alter database datafile ‘?/?/?’ online;’, for each datafile with status offline, to bring it online.

8. Issue the following command using the latest SCN from alert.log on the Node that had to be recovered:

RECOVER DATABASE UNTIL CHANGE ‘1.13.5197’;

NOTE: If for some reason (e.g.when issuing commit force command) automatic recovery (RECO process) needs to be disabled, use the following command to put RECO to sleep:

ALTER SYSTEM DISABLE DISTRIBUTED RECOVERY;


Posted

in

by

Tags:

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *