numeric-linalg
Educational material on the SciPy implementation of numerical linear algebra algorithms
Name | Size | Mode | |
.. | |||
lapack/SRC/cgesvj.f | 57471B | -rw-r--r-- |
0001 0002 0003 0004 0005 0006 0007 0008 0009 0010 0011 0012 0013 0014 0015 0016 0017 0018 0019 0020 0021 0022 0023 0024 0025 0026 0027 0028 0029 0030 0031 0032 0033 0034 0035 0036 0037 0038 0039 0040 0041 0042 0043 0044 0045 0046 0047 0048 0049 0050 0051 0052 0053 0054 0055 0056 0057 0058 0059 0060 0061 0062 0063 0064 0065 0066 0067 0068 0069 0070 0071 0072 0073 0074 0075 0076 0077 0078 0079 0080 0081 0082 0083 0084 0085 0086 0087 0088 0089 0090 0091 0092 0093 0094 0095 0096 0097 0098 0099 0100 0101 0102 0103 0104 0105 0106 0107 0108 0109 0110 0111 0112 0113 0114 0115 0116 0117 0118 0119 0120 0121 0122 0123 0124 0125 0126 0127 0128 0129 0130 0131 0132 0133 0134 0135 0136 0137 0138 0139 0140 0141 0142 0143 0144 0145 0146 0147 0148 0149 0150 0151 0152 0153 0154 0155 0156 0157 0158 0159 0160 0161 0162 0163 0164 0165 0166 0167 0168 0169 0170 0171 0172 0173 0174 0175 0176 0177 0178 0179 0180 0181 0182 0183 0184 0185 0186 0187 0188 0189 0190 0191 0192 0193 0194 0195 0196 0197 0198 0199 0200 0201 0202 0203 0204 0205 0206 0207 0208 0209 0210 0211 0212 0213 0214 0215 0216 0217 0218 0219 0220 0221 0222 0223 0224 0225 0226 0227 0228 0229 0230 0231 0232 0233 0234 0235 0236 0237 0238 0239 0240 0241 0242 0243 0244 0245 0246 0247 0248 0249 0250 0251 0252 0253 0254 0255 0256 0257 0258 0259 0260 0261 0262 0263 0264 0265 0266 0267 0268 0269 0270 0271 0272 0273 0274 0275 0276 0277 0278 0279 0280 0281 0282 0283 0284 0285 0286 0287 0288 0289 0290 0291 0292 0293 0294 0295 0296 0297 0298 0299 0300 0301 0302 0303 0304 0305 0306 0307 0308 0309 0310 0311 0312 0313 0314 0315 0316 0317 0318 0319 0320 0321 0322 0323 0324 0325 0326 0327 0328 0329 0330 0331 0332 0333 0334 0335 0336 0337 0338 0339 0340 0341 0342 0343 0344 0345 0346 0347 0348 0349 0350 0351 0352 0353 0354 0355 0356 0357 0358 0359 0360 0361 0362 0363 0364 0365 0366 0367 0368 0369 0370 0371 0372 0373 0374 0375 0376 0377 0378 0379 0380 0381 0382 0383 0384 0385 0386 0387 0388 0389 0390 0391 0392 0393 0394 0395 0396 0397 0398 0399 0400 0401 0402 0403 0404 0405 0406 0407 0408 0409 0410 0411 0412 0413 0414 0415 0416 0417 0418 0419 0420 0421 0422 0423 0424 0425 0426 0427 0428 0429 0430 0431 0432 0433 0434 0435 0436 0437 0438 0439 0440 0441 0442 0443 0444 0445 0446 0447 0448 0449 0450 0451 0452 0453 0454 0455 0456 0457 0458 0459 0460 0461 0462 0463 0464 0465 0466 0467 0468 0469 0470 0471 0472 0473 0474 0475 0476 0477 0478 0479 0480 0481 0482 0483 0484 0485 0486 0487 0488 0489 0490 0491 0492 0493 0494 0495 0496 0497 0498 0499 0500 0501 0502 0503 0504 0505 0506 0507 0508 0509 0510 0511 0512 0513 0514 0515 0516 0517 0518 0519 0520 0521 0522 0523 0524 0525 0526 0527 0528 0529 0530 0531 0532 0533 0534 0535 0536 0537 0538 0539 0540 0541 0542 0543 0544 0545 0546 0547 0548 0549 0550 0551 0552 0553 0554 0555 0556 0557 0558 0559 0560 0561 0562 0563 0564 0565 0566 0567 0568 0569 0570 0571 0572 0573 0574 0575 0576 0577 0578 0579 0580 0581 0582 0583 0584 0585 0586 0587 0588 0589 0590 0591 0592 0593 0594 0595 0596 0597 0598 0599 0600 0601 0602 0603 0604 0605 0606 0607 0608 0609 0610 0611 0612 0613 0614 0615 0616 0617 0618 0619 0620 0621 0622 0623 0624 0625 0626 0627 0628 0629 0630 0631 0632 0633 0634 0635 0636 0637 0638 0639 0640 0641 0642 0643 0644 0645 0646 0647 0648 0649 0650 0651 0652 0653 0654 0655 0656 0657 0658 0659 0660 0661 0662 0663 0664 0665 0666 0667 0668 0669 0670 0671 0672 0673 0674 0675 0676 0677 0678 0679 0680 0681 0682 0683 0684 0685 0686 0687 0688 0689 0690 0691 0692 0693 0694 0695 0696 0697 0698 0699 0700 0701 0702 0703 0704 0705 0706 0707 0708 0709 0710 0711 0712 0713 0714 0715 0716 0717 0718 0719 0720 0721 0722 0723 0724 0725 0726 0727 0728 0729 0730 0731 0732 0733 0734 0735 0736 0737 0738 0739 0740 0741 0742 0743 0744 0745 0746 0747 0748 0749 0750 0751 0752 0753 0754 0755 0756 0757 0758 0759 0760 0761 0762 0763 0764 0765 0766 0767 0768 0769 0770 0771 0772 0773 0774 0775 0776 0777 0778 0779 0780 0781 0782 0783 0784 0785 0786 0787 0788 0789 0790 0791 0792 0793 0794 0795 0796 0797 0798 0799 0800 0801 0802 0803 0804 0805 0806 0807 0808 0809 0810 0811 0812 0813 0814 0815 0816 0817 0818 0819 0820 0821 0822 0823 0824 0825 0826 0827 0828 0829 0830 0831 0832 0833 0834 0835 0836 0837 0838 0839 0840 0841 0842 0843 0844 0845 0846 0847 0848 0849 0850 0851 0852 0853 0854 0855 0856 0857 0858 0859 0860 0861 0862 0863 0864 0865 0866 0867 0868 0869 0870 0871 0872 0873 0874 0875 0876 0877 0878 0879 0880 0881 0882 0883 0884 0885 0886 0887 0888 0889 0890 0891 0892 0893 0894 0895 0896 0897 0898 0899 0900 0901 0902 0903 0904 0905 0906 0907 0908 0909 0910 0911 0912 0913 0914 0915 0916 0917 0918 0919 0920 0921 0922 0923 0924 0925 0926 0927 0928 0929 0930 0931 0932 0933 0934 0935 0936 0937 0938 0939 0940 0941 0942 0943 0944 0945 0946 0947 0948 0949 0950 0951 0952 0953 0954 0955 0956 0957 0958 0959 0960 0961 0962 0963 0964 0965 0966 0967 0968 0969 0970 0971 0972 0973 0974 0975 0976 0977 0978 0979 0980 0981 0982 0983 0984 0985 0986 0987 0988 0989 0990 0991 0992 0993 0994 0995 0996 0997 0998 0999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478
*> \brief <b> CGESVJ </b> * * =========== DOCUMENTATION =========== * * Online html documentation available at * http://www.netlib.org/lapack/explore-html/ * *> \htmlonly *> Download CGESVJ + dependencies *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/cgesvj.f"> *> [TGZ]</a> *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/cgesvj.f"> *> [ZIP]</a> *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/cgesvj.f"> *> [TXT]</a> *> \endhtmlonly * * Definition: * =========== * * SUBROUTINE CGESVJ( JOBA, JOBU, JOBV, M, N, A, LDA, SVA, MV, V, * LDV, CWORK, LWORK, RWORK, LRWORK, INFO ) * * .. Scalar Arguments .. * INTEGER INFO, LDA, LDV, LWORK, LRWORK, M, MV, N * CHARACTER*1 JOBA, JOBU, JOBV * .. * .. Array Arguments .. * COMPLEX A( LDA, * ), V( LDV, * ), CWORK( LWORK ) * REAL RWORK( LRWORK ), SVA( N ) * .. * * *> \par Purpose: * ============= *> *> \verbatim *> *> CGESVJ computes the singular value decomposition (SVD) of a complex *> M-by-N matrix A, where M >= N. The SVD of A is written as *> [++] [xx] [x0] [xx] *> A = U * SIGMA * V^*, [++] = [xx] * [ox] * [xx] *> [++] [xx] *> where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal *> matrix, and V is an N-by-N unitary matrix. The diagonal elements *> of SIGMA are the singular values of A. The columns of U and V are the *> left and the right singular vectors of A, respectively. *> \endverbatim * * Arguments: * ========== * *> \param[in] JOBA *> \verbatim *> JOBA is CHARACTER*1 *> Specifies the structure of A. *> = 'L': The input matrix A is lower triangular; *> = 'U': The input matrix A is upper triangular; *> = 'G': The input matrix A is general M-by-N matrix, M >= N. *> \endverbatim *> *> \param[in] JOBU *> \verbatim *> JOBU is CHARACTER*1 *> Specifies whether to compute the left singular vectors *> (columns of U): *> = 'U' or 'F': The left singular vectors corresponding to the nonzero *> singular values are computed and returned in the leading *> columns of A. See more details in the description of A. *> The default numerical orthogonality threshold is set to *> approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=SLAMCH('E'). *> = 'C': Analogous to JOBU='U', except that user can control the *> level of numerical orthogonality of the computed left *> singular vectors. TOL can be set to TOL = CTOL*EPS, where *> CTOL is given on input in the array WORK. *> No CTOL smaller than ONE is allowed. CTOL greater *> than 1 / EPS is meaningless. The option 'C' *> can be used if M*EPS is satisfactory orthogonality *> of the computed left singular vectors, so CTOL=M could *> save few sweeps of Jacobi rotations. *> See the descriptions of A and WORK(1). *> = 'N': The matrix U is not computed. However, see the *> description of A. *> \endverbatim *> *> \param[in] JOBV *> \verbatim *> JOBV is CHARACTER*1 *> Specifies whether to compute the right singular vectors, that *> is, the matrix V: *> = 'V' or 'J': the matrix V is computed and returned in the array V *> = 'A': the Jacobi rotations are applied to the MV-by-N *> array V. In other words, the right singular vector *> matrix V is not computed explicitly; instead it is *> applied to an MV-by-N matrix initially stored in the *> first MV rows of V. *> = 'N': the matrix V is not computed and the array V is not *> referenced *> \endverbatim *> *> \param[in] M *> \verbatim *> M is INTEGER *> The number of rows of the input matrix A. 1/SLAMCH('E') > M >= 0. *> \endverbatim *> *> \param[in] N *> \verbatim *> N is INTEGER *> The number of columns of the input matrix A. *> M >= N >= 0. *> \endverbatim *> *> \param[in,out] A *> \verbatim *> A is COMPLEX array, dimension (LDA,N) *> On entry, the M-by-N matrix A. *> On exit, *> If JOBU = 'U' .OR. JOBU = 'C': *> If INFO = 0 : *> RANKA orthonormal columns of U are returned in the *> leading RANKA columns of the array A. Here RANKA <= N *> is the number of computed singular values of A that are *> above the underflow threshold SLAMCH('S'). The singular *> vectors corresponding to underflowed or zero singular *> values are not computed. The value of RANKA is returned *> in the array RWORK as RANKA=NINT(RWORK(2)). Also see the *> descriptions of SVA and RWORK. The computed columns of U *> are mutually numerically orthogonal up to approximately *> TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU = 'C'), *> see the description of JOBU. *> If INFO > 0, *> the procedure CGESVJ did not converge in the given number *> of iterations (sweeps). In that case, the computed *> columns of U may not be orthogonal up to TOL. The output *> U (stored in A), SIGMA (given by the computed singular *> values in SVA(1:N)) and V is still a decomposition of the *> input matrix A in the sense that the residual *> || A - SCALE * U * SIGMA * V^* ||_2 / ||A||_2 is small. *> If JOBU = 'N': *> If INFO = 0 : *> Note that the left singular vectors are 'for free' in the *> one-sided Jacobi SVD algorithm. However, if only the *> singular values are needed, the level of numerical *> orthogonality of U is not an issue and iterations are *> stopped when the columns of the iterated matrix are *> numerically orthogonal up to approximately M*EPS. Thus, *> on exit, A contains the columns of U scaled with the *> corresponding singular values. *> If INFO > 0 : *> the procedure CGESVJ did not converge in the given number *> of iterations (sweeps). *> \endverbatim *> *> \param[in] LDA *> \verbatim *> LDA is INTEGER *> The leading dimension of the array A. LDA >= max(1,M). *> \endverbatim *> *> \param[out] SVA *> \verbatim *> SVA is REAL array, dimension (N) *> On exit, *> If INFO = 0 : *> depending on the value SCALE = RWORK(1), we have: *> If SCALE = ONE: *> SVA(1:N) contains the computed singular values of A. *> During the computation SVA contains the Euclidean column *> norms of the iterated matrices in the array A. *> If SCALE .NE. ONE: *> The singular values of A are SCALE*SVA(1:N), and this *> factored representation is due to the fact that some of the *> singular values of A might underflow or overflow. *> *> If INFO > 0 : *> the procedure CGESVJ did not converge in the given number of *> iterations (sweeps) and SCALE*SVA(1:N) may not be accurate. *> \endverbatim *> *> \param[in] MV *> \verbatim *> MV is INTEGER *> If JOBV = 'A', then the product of Jacobi rotations in CGESVJ *> is applied to the first MV rows of V. See the description of JOBV. *> \endverbatim *> *> \param[in,out] V *> \verbatim *> V is COMPLEX array, dimension (LDV,N) *> If JOBV = 'V', then V contains on exit the N-by-N matrix of *> the right singular vectors; *> If JOBV = 'A', then V contains the product of the computed right *> singular vector matrix and the initial matrix in *> the array V. *> If JOBV = 'N', then V is not referenced. *> \endverbatim *> *> \param[in] LDV *> \verbatim *> LDV is INTEGER *> The leading dimension of the array V, LDV >= 1. *> If JOBV = 'V', then LDV >= max(1,N). *> If JOBV = 'A', then LDV >= max(1,MV) . *> \endverbatim *> *> \param[in,out] CWORK *> \verbatim *> CWORK is COMPLEX array, dimension (max(1,LWORK)) *> Used as workspace. *> \endverbatim *> *> \param[in] LWORK *> \verbatim *> LWORK is INTEGER. *> Length of CWORK. *> LWORK >= 1, if MIN(M,N) = 0, and LWORK >= M+N, otherwise. *> *> If on entry LWORK = -1, then a workspace query is assumed and *> no computation is done; CWORK(1) is set to the minial (and optimal) *> length of CWORK. *> \endverbatim *> *> \param[in,out] RWORK *> \verbatim *> RWORK is REAL array, dimension (max(6,LRWORK)) *> On entry, *> If JOBU = 'C' : *> RWORK(1) = CTOL, where CTOL defines the threshold for convergence. *> The process stops if all columns of A are mutually *> orthogonal up to CTOL*EPS, EPS=SLAMCH('E'). *> It is required that CTOL >= ONE, i.e. it is not *> allowed to force the routine to obtain orthogonality *> below EPSILON. *> On exit, *> RWORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N) *> are the computed singular values of A. *> (See description of SVA().) *> RWORK(2) = NINT(RWORK(2)) is the number of the computed nonzero *> singular values. *> RWORK(3) = NINT(RWORK(3)) is the number of the computed singular *> values that are larger than the underflow threshold. *> RWORK(4) = NINT(RWORK(4)) is the number of sweeps of Jacobi *> rotations needed for numerical convergence. *> RWORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep. *> This is useful information in cases when CGESVJ did *> not converge, as it can be used to estimate whether *> the output is still useful and for post festum analysis. *> RWORK(6) = the largest absolute value over all sines of the *> Jacobi rotation angles in the last sweep. It can be *> useful for a post festum analysis. *> \endverbatim *> *> \param[in] LRWORK *> \verbatim *> LRWORK is INTEGER *> Length of RWORK. *> LRWORK >= 1, if MIN(M,N) = 0, and LRWORK >= MAX(6,N), otherwise *> *> If on entry LRWORK = -1, then a workspace query is assumed and *> no computation is done; RWORK(1) is set to the minial (and optimal) *> length of RWORK. *> \endverbatim *> *> \param[out] INFO *> \verbatim *> INFO is INTEGER *> = 0: successful exit. *> < 0: if INFO = -i, then the i-th argument had an illegal value *> > 0: CGESVJ did not converge in the maximal allowed number *> (NSWEEP=30) of sweeps. The output may still be useful. *> See the description of RWORK. *> \endverbatim *> * Authors: * ======== * *> \author Univ. of Tennessee *> \author Univ. of California Berkeley *> \author Univ. of Colorado Denver *> \author NAG Ltd. * *> \ingroup gesvj * *> \par Further Details: * ===================== *> *> \verbatim *> *> The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane *> rotations. In the case of underflow of the tangent of the Jacobi angle, a *> modified Jacobi transformation of Drmac [3] is used. Pivot strategy uses *> column interchanges of de Rijk [1]. The relative accuracy of the computed *> singular values and the accuracy of the computed singular vectors (in *> angle metric) is as guaranteed by the theory of Demmel and Veselic [2]. *> The condition number that determines the accuracy in the full rank case *> is essentially min_{D=diag} kappa(A*D), where kappa(.) is the *> spectral condition number. The best performance of this Jacobi SVD *> procedure is achieved if used in an accelerated version of Drmac and *> Veselic [4,5], and it is the kernel routine in the SIGMA library [6]. *> Some tuning parameters (marked with [TP]) are available for the *> implementer. *> The computational range for the nonzero singular values is the machine *> number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even *> denormalized singular values can be computed with the corresponding *> gradual loss of accurate digits. *> \endverbatim * *> \par Contributor: * ================== *> *> \verbatim *> *> ============ *> *> Zlatko Drmac (Zagreb, Croatia) *> *> \endverbatim * *> \par References: * ================ *> *> \verbatim *> *> [1] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the *> singular value decomposition on a vector computer. *> SIAM J. Sci. Stat. Comp., Vol. 10 (1998), pp. 359-371. *> [2] J. Demmel and K. Veselic: Jacobi method is more accurate than QR. *> [3] Z. Drmac: Implementation of Jacobi rotations for accurate singular *> value computation in floating point arithmetic. *> SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222. *> [4] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I. *> SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342. *> LAPACK Working note 169. *> [5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II. *> SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362. *> LAPACK Working note 170. *> [6] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV, *> QSVD, (H,K)-SVD computations. *> Department of Mathematics, University of Zagreb, 2008, 2015. *> \endverbatim * *> \par Bugs, examples and comments: * ================================= *> *> \verbatim *> =========================== *> Please report all bugs and send interesting test examples and comments to *> drmac@math.hr. Thank you. *> \endverbatim *> * ===================================================================== SUBROUTINE CGESVJ( JOBA, JOBU, JOBV, M, N, A, LDA, SVA, MV, V, $ LDV, CWORK, LWORK, RWORK, LRWORK, INFO ) * * -- LAPACK computational routine -- * -- LAPACK is a software package provided by Univ. of Tennessee, -- * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..-- * IMPLICIT NONE * .. Scalar Arguments .. INTEGER INFO, LDA, LDV, LWORK, LRWORK, M, MV, N CHARACTER*1 JOBA, JOBU, JOBV * .. * .. Array Arguments .. COMPLEX A( LDA, * ), V( LDV, * ), CWORK( LWORK ) REAL RWORK( LRWORK ), SVA( N ) * .. * * ===================================================================== * * .. Local Parameters .. REAL ZERO, HALF, ONE PARAMETER ( ZERO = 0.0E0, HALF = 0.5E0, ONE = 1.0E0) COMPLEX CZERO, CONE PARAMETER ( CZERO = (0.0E0, 0.0E0), CONE = (1.0E0, 0.0E0) ) INTEGER NSWEEP PARAMETER ( NSWEEP = 30 ) * .. * .. Local Scalars .. COMPLEX AAPQ, OMPQ REAL AAPP, AAPP0, AAPQ1, AAQQ, APOAQ, AQOAP, BIG, $ BIGTHETA, CS, CTOL, EPSLN, MXAAPQ, $ MXSINJ, ROOTBIG, ROOTEPS, ROOTSFMIN, ROOTTOL, $ SKL, SFMIN, SMALL, SN, T, TEMP1, THETA, THSIGN, TOL INTEGER BLSKIP, EMPTSW, i, ibr, IERR, igl, IJBLSK, ir1, $ ISWROT, jbc, jgl, KBL, LKAHEAD, MVL, N2, N34, $ N4, NBL, NOTROT, p, PSKIPPED, q, ROWSKIP, SWBAND, $ MINMN, LWMIN, LRWMIN LOGICAL APPLV, GOSCALE, LOWER, LQUERY, LSVEC, NOSCALE, ROTOK, $ RSVEC, UCTOL, UPPER * .. * .. * .. Intrinsic Functions .. INTRINSIC ABS, MAX, MIN, CONJG, REAL, SIGN, SQRT * .. * .. External Functions .. * .. * from BLAS REAL SCNRM2 COMPLEX CDOTC EXTERNAL CDOTC, SCNRM2 INTEGER ISAMAX EXTERNAL ISAMAX * from LAPACK REAL SLAMCH, SROUNDUP_LWORK EXTERNAL SLAMCH, SROUNDUP_LWORK LOGICAL LSAME EXTERNAL LSAME * .. * .. External Subroutines .. * .. * from BLAS EXTERNAL CCOPY, CROT, CSSCAL, CSWAP, CAXPY * from LAPACK EXTERNAL CLASCL, CLASET, CLASSQ, SLASCL, $ XERBLA EXTERNAL CGSVJ0, CGSVJ1 * .. * .. Executable Statements .. * * Test the input arguments * LSVEC = LSAME( JOBU, 'U' ) .OR. LSAME( JOBU, 'F' ) UCTOL = LSAME( JOBU, 'C' ) RSVEC = LSAME( JOBV, 'V' ) .OR. LSAME( JOBV, 'J' ) APPLV = LSAME( JOBV, 'A' ) UPPER = LSAME( JOBA, 'U' ) LOWER = LSAME( JOBA, 'L' ) * MINMN = MIN( M, N ) IF( MINMN.EQ.0 ) THEN LWMIN = 1 LRWMIN = 1 ELSE LWMIN = M + N LRWMIN = MAX( 6, N ) END IF * LQUERY = ( LWORK.EQ.-1 ) .OR. ( LRWORK.EQ.-1 ) IF( .NOT.( UPPER .OR. LOWER .OR. LSAME( JOBA, 'G' ) ) ) THEN INFO = -1 ELSE IF( .NOT.( LSVEC .OR. $ UCTOL .OR. $ LSAME( JOBU, 'N' ) ) ) THEN INFO = -2 ELSE IF( .NOT.( RSVEC .OR. $ APPLV .OR. $ LSAME( JOBV, 'N' ) ) ) THEN INFO = -3 ELSE IF( M.LT.0 ) THEN INFO = -4 ELSE IF( ( N.LT.0 ) .OR. ( N.GT.M ) ) THEN INFO = -5 ELSE IF( LDA.LT.M ) THEN INFO = -7 ELSE IF( MV.LT.0 ) THEN INFO = -9 ELSE IF( ( RSVEC .AND. ( LDV.LT.N ) ) .OR. $ ( APPLV .AND. ( LDV.LT.MV ) ) ) THEN INFO = -11 ELSE IF( UCTOL .AND. ( RWORK( 1 ).LE.ONE ) ) THEN INFO = -12 ELSE IF( LWORK.LT.LWMIN .AND. ( .NOT.LQUERY ) ) THEN INFO = -13 ELSE IF( LRWORK.LT.LRWMIN .AND. ( .NOT.LQUERY ) ) THEN INFO = -15 ELSE INFO = 0 END IF * * #:( IF( INFO.NE.0 ) THEN CALL XERBLA( 'CGESVJ', -INFO ) RETURN ELSE IF( LQUERY ) THEN CWORK( 1 ) = SROUNDUP_LWORK( LWMIN ) RWORK( 1 ) = SROUNDUP_LWORK( LRWMIN ) RETURN END IF * * #:) Quick return for void matrix * IF( MINMN.EQ.0 ) RETURN * * Set numerical parameters * The stopping criterion for Jacobi rotations is * * max_{i<>j}|A(:,i)^* * A(:,j)| / (||A(:,i)||*||A(:,j)||) < CTOL*EPS * * where EPS is the round-off and CTOL is defined as follows: * IF( UCTOL ) THEN * ... user controlled CTOL = RWORK( 1 ) ELSE * ... default IF( LSVEC .OR. RSVEC .OR. APPLV ) THEN CTOL = SQRT( REAL( M ) ) ELSE CTOL = REAL( M ) END IF END IF * ... and the machine dependent parameters are *[!] (Make sure that SLAMCH() works properly on the target machine.) * EPSLN = SLAMCH( 'Epsilon' ) ROOTEPS = SQRT( EPSLN ) SFMIN = SLAMCH( 'SafeMinimum' ) ROOTSFMIN = SQRT( SFMIN ) SMALL = SFMIN / EPSLN * BIG = SLAMCH( 'Overflow' ) BIG = ONE / SFMIN ROOTBIG = ONE / ROOTSFMIN * LARGE = BIG / SQRT( REAL( M*N ) ) BIGTHETA = ONE / ROOTEPS * TOL = CTOL*EPSLN ROOTTOL = SQRT( TOL ) * IF( REAL( M )*EPSLN.GE.ONE ) THEN INFO = -4 CALL XERBLA( 'CGESVJ', -INFO ) RETURN END IF * * Initialize the right singular vector matrix. * IF( RSVEC ) THEN MVL = N CALL CLASET( 'A', MVL, N, CZERO, CONE, V, LDV ) ELSE IF( APPLV ) THEN MVL = MV END IF RSVEC = RSVEC .OR. APPLV * * Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N ) *(!) If necessary, scale A to protect the largest singular value * from overflow. It is possible that saving the largest singular * value destroys the information about the small ones. * This initial scaling is almost minimal in the sense that the * goal is to make sure that no column norm overflows, and that * SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries * in A are detected, the procedure returns with INFO=-6. * SKL = ONE / SQRT( REAL( M )*REAL( N ) ) NOSCALE = .TRUE. GOSCALE = .TRUE. * IF( LOWER ) THEN * the input matrix is M-by-N lower triangular (trapezoidal) DO 1874 p = 1, N AAPP = ZERO AAQQ = ONE CALL CLASSQ( M-p+1, A( p, p ), 1, AAPP, AAQQ ) IF( AAPP.GT.BIG ) THEN INFO = -6 CALL XERBLA( 'CGESVJ', -INFO ) RETURN END IF AAQQ = SQRT( AAQQ ) IF( ( AAPP.LT.( BIG / AAQQ ) ) .AND. NOSCALE ) THEN SVA( p ) = AAPP*AAQQ ELSE NOSCALE = .FALSE. SVA( p ) = AAPP*( AAQQ*SKL ) IF( GOSCALE ) THEN GOSCALE = .FALSE. DO 1873 q = 1, p - 1 SVA( q ) = SVA( q )*SKL 1873 CONTINUE END IF END IF 1874 CONTINUE ELSE IF( UPPER ) THEN * the input matrix is M-by-N upper triangular (trapezoidal) DO 2874 p = 1, N AAPP = ZERO AAQQ = ONE CALL CLASSQ( p, A( 1, p ), 1, AAPP, AAQQ ) IF( AAPP.GT.BIG ) THEN INFO = -6 CALL XERBLA( 'CGESVJ', -INFO ) RETURN END IF AAQQ = SQRT( AAQQ ) IF( ( AAPP.LT.( BIG / AAQQ ) ) .AND. NOSCALE ) THEN SVA( p ) = AAPP*AAQQ ELSE NOSCALE = .FALSE. SVA( p ) = AAPP*( AAQQ*SKL ) IF( GOSCALE ) THEN GOSCALE = .FALSE. DO 2873 q = 1, p - 1 SVA( q ) = SVA( q )*SKL 2873 CONTINUE END IF END IF 2874 CONTINUE ELSE * the input matrix is M-by-N general dense DO 3874 p = 1, N AAPP = ZERO AAQQ = ONE CALL CLASSQ( M, A( 1, p ), 1, AAPP, AAQQ ) IF( AAPP.GT.BIG ) THEN INFO = -6 CALL XERBLA( 'CGESVJ', -INFO ) RETURN END IF AAQQ = SQRT( AAQQ ) IF( ( AAPP.LT.( BIG / AAQQ ) ) .AND. NOSCALE ) THEN SVA( p ) = AAPP*AAQQ ELSE NOSCALE = .FALSE. SVA( p ) = AAPP*( AAQQ*SKL ) IF( GOSCALE ) THEN GOSCALE = .FALSE. DO 3873 q = 1, p - 1 SVA( q ) = SVA( q )*SKL 3873 CONTINUE END IF END IF 3874 CONTINUE END IF * IF( NOSCALE )SKL = ONE * * Move the smaller part of the spectrum from the underflow threshold *(!) Start by determining the position of the nonzero entries of the * array SVA() relative to ( SFMIN, BIG ). * AAPP = ZERO AAQQ = BIG DO 4781 p = 1, N IF( SVA( p ).NE.ZERO )AAQQ = MIN( AAQQ, SVA( p ) ) AAPP = MAX( AAPP, SVA( p ) ) 4781 CONTINUE * * #:) Quick return for zero matrix * IF( AAPP.EQ.ZERO ) THEN IF( LSVEC )CALL CLASET( 'G', M, N, CZERO, CONE, A, LDA ) RWORK( 1 ) = ONE RWORK( 2 ) = ZERO RWORK( 3 ) = ZERO RWORK( 4 ) = ZERO RWORK( 5 ) = ZERO RWORK( 6 ) = ZERO RETURN END IF * * #:) Quick return for one-column matrix * IF( N.EQ.1 ) THEN IF( LSVEC )CALL CLASCL( 'G', 0, 0, SVA( 1 ), SKL, M, 1, $ A( 1, 1 ), LDA, IERR ) RWORK( 1 ) = ONE / SKL IF( SVA( 1 ).GE.SFMIN ) THEN RWORK( 2 ) = ONE ELSE RWORK( 2 ) = ZERO END IF RWORK( 3 ) = ZERO RWORK( 4 ) = ZERO RWORK( 5 ) = ZERO RWORK( 6 ) = ZERO RETURN END IF * * Protect small singular values from underflow, and try to * avoid underflows/overflows in computing Jacobi rotations. * SN = SQRT( SFMIN / EPSLN ) TEMP1 = SQRT( BIG / REAL( N ) ) IF( ( AAPP.LE.SN ) .OR. ( AAQQ.GE.TEMP1 ) .OR. $ ( ( SN.LE.AAQQ ) .AND. ( AAPP.LE.TEMP1 ) ) ) THEN TEMP1 = MIN( BIG, TEMP1 / AAPP ) * AAQQ = AAQQ*TEMP1 * AAPP = AAPP*TEMP1 ELSE IF( ( AAQQ.LE.SN ) .AND. ( AAPP.LE.TEMP1 ) ) THEN TEMP1 = MIN( SN / AAQQ, BIG / ( AAPP*SQRT( REAL( N ) ) ) ) * AAQQ = AAQQ*TEMP1 * AAPP = AAPP*TEMP1 ELSE IF( ( AAQQ.GE.SN ) .AND. ( AAPP.GE.TEMP1 ) ) THEN TEMP1 = MAX( SN / AAQQ, TEMP1 / AAPP ) * AAQQ = AAQQ*TEMP1 * AAPP = AAPP*TEMP1 ELSE IF( ( AAQQ.LE.SN ) .AND. ( AAPP.GE.TEMP1 ) ) THEN TEMP1 = MIN( SN / AAQQ, BIG / ( SQRT( REAL( N ) )*AAPP ) ) * AAQQ = AAQQ*TEMP1 * AAPP = AAPP*TEMP1 ELSE TEMP1 = ONE END IF * * Scale, if necessary * IF( TEMP1.NE.ONE ) THEN CALL SLASCL( 'G', 0, 0, ONE, TEMP1, N, 1, SVA, N, IERR ) END IF SKL = TEMP1*SKL IF( SKL.NE.ONE ) THEN CALL CLASCL( JOBA, 0, 0, ONE, SKL, M, N, A, LDA, IERR ) SKL = ONE / SKL END IF * * Row-cyclic Jacobi SVD algorithm with column pivoting * EMPTSW = ( N*( N-1 ) ) / 2 NOTROT = 0 DO 1868 q = 1, N CWORK( q ) = CONE 1868 CONTINUE * * * SWBAND = 3 *[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective * if CGESVJ is used as a computational routine in the preconditioned * Jacobi SVD algorithm CGEJSV. For sweeps i=1:SWBAND the procedure * works on pivots inside a band-like region around the diagonal. * The boundaries are determined dynamically, based on the number of * pivots above a threshold. * KBL = MIN( 8, N ) *[TP] KBL is a tuning parameter that defines the tile size in the * tiling of the p-q loops of pivot pairs. In general, an optimal * value of KBL depends on the matrix dimensions and on the * parameters of the computer's memory. * NBL = N / KBL IF( ( NBL*KBL ).NE.N )NBL = NBL + 1 * BLSKIP = KBL**2 *[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL. * ROWSKIP = MIN( 5, KBL ) *[TP] ROWSKIP is a tuning parameter. * LKAHEAD = 1 *[TP] LKAHEAD is a tuning parameter. * * Quasi block transformations, using the lower (upper) triangular * structure of the input matrix. The quasi-block-cycling usually * invokes cubic convergence. Big part of this cycle is done inside * canonical subspaces of dimensions less than M. * IF( ( LOWER .OR. UPPER ) .AND. ( N.GT.MAX( 64, 4*KBL ) ) ) THEN *[TP] The number of partition levels and the actual partition are * tuning parameters. N4 = N / 4 N2 = N / 2 N34 = 3*N4 IF( APPLV ) THEN q = 0 ELSE q = 1 END IF * IF( LOWER ) THEN * * This works very well on lower triangular matrices, in particular * in the framework of the preconditioned Jacobi SVD (xGEJSV). * The idea is simple: * [+ 0 0 0] Note that Jacobi transformations of [0 0] * [+ + 0 0] [0 0] * [+ + x 0] actually work on [x 0] [x 0] * [+ + x x] [x x]. [x x] * CALL CGSVJ0( JOBV, M-N34, N-N34, A( N34+1, N34+1 ), LDA, $ CWORK( N34+1 ), SVA( N34+1 ), MVL, $ V( N34*q+1, N34+1 ), LDV, EPSLN, SFMIN, TOL, $ 2, CWORK( N+1 ), LWORK-N, IERR ) CALL CGSVJ0( JOBV, M-N2, N34-N2, A( N2+1, N2+1 ), LDA, $ CWORK( N2+1 ), SVA( N2+1 ), MVL, $ V( N2*q+1, N2+1 ), LDV, EPSLN, SFMIN, TOL, 2, $ CWORK( N+1 ), LWORK-N, IERR ) CALL CGSVJ1( JOBV, M-N2, N-N2, N4, A( N2+1, N2+1 ), LDA, $ CWORK( N2+1 ), SVA( N2+1 ), MVL, $ V( N2*q+1, N2+1 ), LDV, EPSLN, SFMIN, TOL, 1, $ CWORK( N+1 ), LWORK-N, IERR ) * CALL CGSVJ0( JOBV, M-N4, N2-N4, A( N4+1, N4+1 ), LDA, $ CWORK( N4+1 ), SVA( N4+1 ), MVL, $ V( N4*q+1, N4+1 ), LDV, EPSLN, SFMIN, TOL, 1, $ CWORK( N+1 ), LWORK-N, IERR ) * CALL CGSVJ0( JOBV, M, N4, A, LDA, CWORK, SVA, MVL, V, $ LDV, $ EPSLN, SFMIN, TOL, 1, CWORK( N+1 ), LWORK-N, $ IERR ) * CALL CGSVJ1( JOBV, M, N2, N4, A, LDA, CWORK, SVA, MVL, V, $ LDV, EPSLN, SFMIN, TOL, 1, CWORK( N+1 ), $ LWORK-N, IERR ) * * ELSE IF( UPPER ) THEN * * CALL CGSVJ0( JOBV, N4, N4, A, LDA, CWORK, SVA, MVL, V, $ LDV, $ EPSLN, SFMIN, TOL, 2, CWORK( N+1 ), LWORK-N, $ IERR ) * CALL CGSVJ0( JOBV, N2, N4, A( 1, N4+1 ), LDA, $ CWORK( N4+1 ), $ SVA( N4+1 ), MVL, V( N4*q+1, N4+1 ), LDV, $ EPSLN, SFMIN, TOL, 1, CWORK( N+1 ), LWORK-N, $ IERR ) * CALL CGSVJ1( JOBV, N2, N2, N4, A, LDA, CWORK, SVA, MVL, $ V, $ LDV, EPSLN, SFMIN, TOL, 1, CWORK( N+1 ), $ LWORK-N, IERR ) * CALL CGSVJ0( JOBV, N2+N4, N4, A( 1, N2+1 ), LDA, $ CWORK( N2+1 ), SVA( N2+1 ), MVL, $ V( N2*q+1, N2+1 ), LDV, EPSLN, SFMIN, TOL, 1, $ CWORK( N+1 ), LWORK-N, IERR ) END IF * END IF * * .. Row-cyclic pivot strategy with de Rijk's pivoting .. * DO 1993 i = 1, NSWEEP * * .. go go go ... * MXAAPQ = ZERO MXSINJ = ZERO ISWROT = 0 * NOTROT = 0 PSKIPPED = 0 * * Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs * 1 <= p < q <= N. This is the first step toward a blocked implementation * of the rotations. New implementation, based on block transformations, * is under development. * DO 2000 ibr = 1, NBL * igl = ( ibr-1 )*KBL + 1 * DO 1002 ir1 = 0, MIN( LKAHEAD, NBL-ibr ) * igl = igl + ir1*KBL * DO 2001 p = igl, MIN( igl+KBL-1, N-1 ) * * .. de Rijk's pivoting * q = ISAMAX( N-p+1, SVA( p ), 1 ) + p - 1 IF( p.NE.q ) THEN CALL CSWAP( M, A( 1, p ), 1, A( 1, q ), 1 ) IF( RSVEC )CALL CSWAP( MVL, V( 1, p ), 1, $ V( 1, q ), 1 ) TEMP1 = SVA( p ) SVA( p ) = SVA( q ) SVA( q ) = TEMP1 AAPQ = CWORK(p) CWORK(p) = CWORK(q) CWORK(q) = AAPQ END IF * IF( ir1.EQ.0 ) THEN * * Column norms are periodically updated by explicit * norm computation. *[!] Caveat: * Unfortunately, some BLAS implementations compute SCNRM2(M,A(1,p),1) * as SQRT(S=CDOTC(M,A(1,p),1,A(1,p),1)), which may cause the result to * overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to * underflow for ||A(:,p)||_2 < SQRT(underflow_threshold). * Hence, SCNRM2 cannot be trusted, not even in the case when * the true norm is far from the under(over)flow boundaries. * If properly implemented SCNRM2 is available, the IF-THEN-ELSE-END IF * below should be replaced with "AAPP = SCNRM2( M, A(1,p), 1 )". * IF( ( SVA( p ).LT.ROOTBIG ) .AND. $ ( SVA( p ).GT.ROOTSFMIN ) ) THEN SVA( p ) = SCNRM2( M, A( 1, p ), 1 ) ELSE TEMP1 = ZERO AAPP = ONE CALL CLASSQ( M, A( 1, p ), 1, TEMP1, AAPP ) SVA( p ) = TEMP1*SQRT( AAPP ) END IF AAPP = SVA( p ) ELSE AAPP = SVA( p ) END IF * IF( AAPP.GT.ZERO ) THEN * PSKIPPED = 0 * DO 2002 q = p + 1, MIN( igl+KBL-1, N ) * AAQQ = SVA( q ) * IF( AAQQ.GT.ZERO ) THEN * AAPP0 = AAPP IF( AAQQ.GE.ONE ) THEN ROTOK = ( SMALL*AAPP ).LE.AAQQ IF( AAPP.LT.( BIG / AAQQ ) ) THEN AAPQ = ( CDOTC( M, A( 1, p ), 1, $ A( 1, q ), 1 ) / AAQQ ) / AAPP ELSE CALL CCOPY( M, A( 1, p ), 1, $ CWORK(N+1), 1 ) CALL CLASCL( 'G', 0, 0, AAPP, ONE, $ M, 1, CWORK(N+1), LDA, IERR ) AAPQ = CDOTC( M, CWORK(N+1), 1, $ A( 1, q ), 1 ) / AAQQ END IF ELSE ROTOK = AAPP.LE.( AAQQ / SMALL ) IF( AAPP.GT.( SMALL / AAQQ ) ) THEN AAPQ = ( CDOTC( M, A( 1, p ), 1, $ A( 1, q ), 1 ) / AAPP ) / AAQQ ELSE CALL CCOPY( M, A( 1, q ), 1, $ CWORK(N+1), 1 ) CALL CLASCL( 'G', 0, 0, AAQQ, $ ONE, M, 1, $ CWORK(N+1), LDA, IERR ) AAPQ = CDOTC( M, A(1, p ), 1, $ CWORK(N+1), 1 ) / AAPP END IF END IF * * AAPQ = AAPQ * CONJG( CWORK(p) ) * CWORK(q) AAPQ1 = -ABS(AAPQ) MXAAPQ = MAX( MXAAPQ, -AAPQ1 ) * * TO rotate or NOT to rotate, THAT is the question ... * IF( ABS( AAPQ1 ).GT.TOL ) THEN OMPQ = AAPQ / ABS(AAPQ) * * .. rotate *[RTD] ROTATED = ROTATED + ONE * IF( ir1.EQ.0 ) THEN NOTROT = 0 PSKIPPED = 0 ISWROT = ISWROT + 1 END IF * IF( ROTOK ) THEN * AQOAP = AAQQ / AAPP APOAQ = AAPP / AAQQ THETA = -HALF*ABS( AQOAP-APOAQ )/AAPQ1 * IF( ABS( THETA ).GT.BIGTHETA ) THEN * T = HALF / THETA CS = ONE CALL CROT( M, A(1,p), 1, A(1,q), $ 1, $ CS, CONJG(OMPQ)*T ) IF ( RSVEC ) THEN CALL CROT( MVL, V(1,p), 1, $ V(1,q), 1, CS, CONJG(OMPQ)*T ) END IF SVA( q ) = AAQQ*SQRT( MAX( ZERO, $ ONE+T*APOAQ*AAPQ1 ) ) AAPP = AAPP*SQRT( MAX( ZERO, $ ONE-T*AQOAP*AAPQ1 ) ) MXSINJ = MAX( MXSINJ, ABS( T ) ) * ELSE * * .. choose correct signum for THETA and rotate * THSIGN = -SIGN( ONE, AAPQ1 ) T = ONE / ( THETA+THSIGN* $ SQRT( ONE+THETA*THETA ) ) CS = SQRT( ONE / ( ONE+T*T ) ) SN = T*CS * MXSINJ = MAX( MXSINJ, ABS( SN ) ) SVA( q ) = AAQQ*SQRT( MAX( ZERO, $ ONE+T*APOAQ*AAPQ1 ) ) AAPP = AAPP*SQRT( MAX( ZERO, $ ONE-T*AQOAP*AAPQ1 ) ) * CALL CROT( M, A(1,p), 1, A(1,q), $ 1, $ CS, CONJG(OMPQ)*SN ) IF ( RSVEC ) THEN CALL CROT( MVL, V(1,p), 1, $ V(1,q), 1, CS, CONJG(OMPQ)*SN ) END IF END IF CWORK(p) = -CWORK(q) * OMPQ * ELSE * .. have to use modified Gram-Schmidt like transformation CALL CCOPY( M, A( 1, p ), 1, $ CWORK(N+1), 1 ) CALL CLASCL( 'G', 0, 0, AAPP, ONE, $ M, $ 1, CWORK(N+1), LDA, $ IERR ) CALL CLASCL( 'G', 0, 0, AAQQ, ONE, $ M, $ 1, A( 1, q ), LDA, IERR ) CALL CAXPY( M, -AAPQ, CWORK(N+1), 1, $ A( 1, q ), 1 ) CALL CLASCL( 'G', 0, 0, ONE, AAQQ, $ M, $ 1, A( 1, q ), LDA, IERR ) SVA( q ) = AAQQ*SQRT( MAX( ZERO, $ ONE-AAPQ1*AAPQ1 ) ) MXSINJ = MAX( MXSINJ, SFMIN ) END IF * END IF ROTOK THEN ... ELSE * * In the case of cancellation in updating SVA(q), SVA(p) * recompute SVA(q), SVA(p). * IF( ( SVA( q ) / AAQQ )**2.LE.ROOTEPS ) $ THEN IF( ( AAQQ.LT.ROOTBIG ) .AND. $ ( AAQQ.GT.ROOTSFMIN ) ) THEN SVA( q ) = SCNRM2( M, A( 1, q ), $ 1 ) ELSE T = ZERO AAQQ = ONE CALL CLASSQ( M, A( 1, q ), 1, T, $ AAQQ ) SVA( q ) = T*SQRT( AAQQ ) END IF END IF IF( ( AAPP / AAPP0 ).LE.ROOTEPS ) THEN IF( ( AAPP.LT.ROOTBIG ) .AND. $ ( AAPP.GT.ROOTSFMIN ) ) THEN AAPP = SCNRM2( M, A( 1, p ), 1 ) ELSE T = ZERO AAPP = ONE CALL CLASSQ( M, A( 1, p ), 1, T, $ AAPP ) AAPP = T*SQRT( AAPP ) END IF SVA( p ) = AAPP END IF * ELSE * A(:,p) and A(:,q) already numerically orthogonal IF( ir1.EQ.0 )NOTROT = NOTROT + 1 *[RTD] SKIPPED = SKIPPED + 1 PSKIPPED = PSKIPPED + 1 END IF ELSE * A(:,q) is zero column IF( ir1.EQ.0 )NOTROT = NOTROT + 1 PSKIPPED = PSKIPPED + 1 END IF * IF( ( i.LE.SWBAND ) .AND. $ ( PSKIPPED.GT.ROWSKIP ) ) THEN IF( ir1.EQ.0 )AAPP = -AAPP NOTROT = 0 GO TO 2103 END IF * 2002 CONTINUE * END q-LOOP * 2103 CONTINUE * bailed out of q-loop * SVA( p ) = AAPP * ELSE SVA( p ) = AAPP IF( ( ir1.EQ.0 ) .AND. ( AAPP.EQ.ZERO ) ) $ NOTROT = NOTROT + MIN( igl+KBL-1, N ) - p END IF * 2001 CONTINUE * end of the p-loop * end of doing the block ( ibr, ibr ) 1002 CONTINUE * end of ir1-loop * * ... go to the off diagonal blocks * igl = ( ibr-1 )*KBL + 1 * DO 2010 jbc = ibr + 1, NBL * jgl = ( jbc-1 )*KBL + 1 * * doing the block at ( ibr, jbc ) * IJBLSK = 0 DO 2100 p = igl, MIN( igl+KBL-1, N ) * AAPP = SVA( p ) IF( AAPP.GT.ZERO ) THEN * PSKIPPED = 0 * DO 2200 q = jgl, MIN( jgl+KBL-1, N ) * AAQQ = SVA( q ) IF( AAQQ.GT.ZERO ) THEN AAPP0 = AAPP * * .. M x 2 Jacobi SVD .. * * Safe Gram matrix computation * IF( AAQQ.GE.ONE ) THEN IF( AAPP.GE.AAQQ ) THEN ROTOK = ( SMALL*AAPP ).LE.AAQQ ELSE ROTOK = ( SMALL*AAQQ ).LE.AAPP END IF IF( AAPP.LT.( BIG / AAQQ ) ) THEN AAPQ = ( CDOTC( M, A( 1, p ), 1, $ A( 1, q ), 1 ) / AAQQ ) / AAPP ELSE CALL CCOPY( M, A( 1, p ), 1, $ CWORK(N+1), 1 ) CALL CLASCL( 'G', 0, 0, AAPP, $ ONE, M, 1, $ CWORK(N+1), LDA, IERR ) AAPQ = CDOTC( M, CWORK(N+1), 1, $ A( 1, q ), 1 ) / AAQQ END IF ELSE IF( AAPP.GE.AAQQ ) THEN ROTOK = AAPP.LE.( AAQQ / SMALL ) ELSE ROTOK = AAQQ.LE.( AAPP / SMALL ) END IF IF( AAPP.GT.( SMALL / AAQQ ) ) THEN AAPQ = ( CDOTC( M, A( 1, p ), 1, $ A( 1, q ), 1 ) / MAX(AAQQ,AAPP) ) $ / MIN(AAQQ,AAPP) ELSE CALL CCOPY( M, A( 1, q ), 1, $ CWORK(N+1), 1 ) CALL CLASCL( 'G', 0, 0, AAQQ, $ ONE, M, 1, $ CWORK(N+1), LDA, IERR ) AAPQ = CDOTC( M, A( 1, p ), 1, $ CWORK(N+1), 1 ) / AAPP END IF END IF * * AAPQ = AAPQ * CONJG(CWORK(p))*CWORK(q) AAPQ1 = -ABS(AAPQ) MXAAPQ = MAX( MXAAPQ, -AAPQ1 ) * * TO rotate or NOT to rotate, THAT is the question ... * IF( ABS( AAPQ1 ).GT.TOL ) THEN OMPQ = AAPQ / ABS(AAPQ) NOTROT = 0 *[RTD] ROTATED = ROTATED + 1 PSKIPPED = 0 ISWROT = ISWROT + 1 * IF( ROTOK ) THEN * AQOAP = AAQQ / AAPP APOAQ = AAPP / AAQQ THETA = -HALF*ABS( AQOAP-APOAQ )/ AAPQ1 IF( AAQQ.GT.AAPP0 )THETA = -THETA * IF( ABS( THETA ).GT.BIGTHETA ) THEN T = HALF / THETA CS = ONE CALL CROT( M, A(1,p), 1, A(1,q), $ 1, $ CS, CONJG(OMPQ)*T ) IF( RSVEC ) THEN CALL CROT( MVL, V(1,p), 1, $ V(1,q), 1, CS, CONJG(OMPQ)*T ) END IF SVA( q ) = AAQQ*SQRT( MAX( ZERO, $ ONE+T*APOAQ*AAPQ1 ) ) AAPP = AAPP*SQRT( MAX( ZERO, $ ONE-T*AQOAP*AAPQ1 ) ) MXSINJ = MAX( MXSINJ, ABS( T ) ) ELSE * * .. choose correct signum for THETA and rotate * THSIGN = -SIGN( ONE, AAPQ1 ) IF( AAQQ.GT.AAPP0 )THSIGN = -THSIGN T = ONE / ( THETA+THSIGN* $ SQRT( ONE+THETA*THETA ) ) CS = SQRT( ONE / ( ONE+T*T ) ) SN = T*CS MXSINJ = MAX( MXSINJ, ABS( SN ) ) SVA( q ) = AAQQ*SQRT( MAX( ZERO, $ ONE+T*APOAQ*AAPQ1 ) ) AAPP = AAPP*SQRT( MAX( ZERO, $ ONE-T*AQOAP*AAPQ1 ) ) * CALL CROT( M, A(1,p), 1, A(1,q), $ 1, $ CS, CONJG(OMPQ)*SN ) IF( RSVEC ) THEN CALL CROT( MVL, V(1,p), 1, $ V(1,q), 1, CS, CONJG(OMPQ)*SN ) END IF END IF CWORK(p) = -CWORK(q) * OMPQ * ELSE * .. have to use modified Gram-Schmidt like transformation IF( AAPP.GT.AAQQ ) THEN CALL CCOPY( M, A( 1, p ), 1, $ CWORK(N+1), 1 ) CALL CLASCL( 'G', 0, 0, AAPP, $ ONE, $ M, 1, CWORK(N+1),LDA, $ IERR ) CALL CLASCL( 'G', 0, 0, AAQQ, $ ONE, $ M, 1, A( 1, q ), LDA, $ IERR ) CALL CAXPY( M, -AAPQ, CWORK(N+1), $ 1, A( 1, q ), 1 ) CALL CLASCL( 'G', 0, 0, ONE, $ AAQQ, $ M, 1, A( 1, q ), LDA, $ IERR ) SVA( q ) = AAQQ*SQRT( MAX( ZERO, $ ONE-AAPQ1*AAPQ1 ) ) MXSINJ = MAX( MXSINJ, SFMIN ) ELSE CALL CCOPY( M, A( 1, q ), 1, $ CWORK(N+1), 1 ) CALL CLASCL( 'G', 0, 0, AAQQ, $ ONE, $ M, 1, CWORK(N+1),LDA, $ IERR ) CALL CLASCL( 'G', 0, 0, AAPP, $ ONE, $ M, 1, A( 1, p ), LDA, $ IERR ) CALL CAXPY( M, -CONJG(AAPQ), $ CWORK(N+1), 1, A( 1, p ), 1 ) CALL CLASCL( 'G', 0, 0, ONE, $ AAPP, $ M, 1, A( 1, p ), LDA, $ IERR ) SVA( p ) = AAPP*SQRT( MAX( ZERO, $ ONE-AAPQ1*AAPQ1 ) ) MXSINJ = MAX( MXSINJ, SFMIN ) END IF END IF * END IF ROTOK THEN ... ELSE * * In the case of cancellation in updating SVA(q), SVA(p) * .. recompute SVA(q), SVA(p) IF( ( SVA( q ) / AAQQ )**2.LE.ROOTEPS ) $ THEN IF( ( AAQQ.LT.ROOTBIG ) .AND. $ ( AAQQ.GT.ROOTSFMIN ) ) THEN SVA( q ) = SCNRM2( M, A( 1, q ), $ 1) ELSE T = ZERO AAQQ = ONE CALL CLASSQ( M, A( 1, q ), 1, T, $ AAQQ ) SVA( q ) = T*SQRT( AAQQ ) END IF END IF IF( ( AAPP / AAPP0 )**2.LE.ROOTEPS ) THEN IF( ( AAPP.LT.ROOTBIG ) .AND. $ ( AAPP.GT.ROOTSFMIN ) ) THEN AAPP = SCNRM2( M, A( 1, p ), 1 ) ELSE T = ZERO AAPP = ONE CALL CLASSQ( M, A( 1, p ), 1, T, $ AAPP ) AAPP = T*SQRT( AAPP ) END IF SVA( p ) = AAPP END IF * end of OK rotation ELSE NOTROT = NOTROT + 1 *[RTD] SKIPPED = SKIPPED + 1 PSKIPPED = PSKIPPED + 1 IJBLSK = IJBLSK + 1 END IF ELSE NOTROT = NOTROT + 1 PSKIPPED = PSKIPPED + 1 IJBLSK = IJBLSK + 1 END IF * IF( ( i.LE.SWBAND ) .AND. ( IJBLSK.GE.BLSKIP ) ) $ THEN SVA( p ) = AAPP NOTROT = 0 GO TO 2011 END IF IF( ( i.LE.SWBAND ) .AND. $ ( PSKIPPED.GT.ROWSKIP ) ) THEN AAPP = -AAPP NOTROT = 0 GO TO 2203 END IF * 2200 CONTINUE * end of the q-loop 2203 CONTINUE * SVA( p ) = AAPP * ELSE * IF( AAPP.EQ.ZERO )NOTROT = NOTROT + $ MIN( jgl+KBL-1, N ) - jgl + 1 IF( AAPP.LT.ZERO )NOTROT = 0 * END IF * 2100 CONTINUE * end of the p-loop 2010 CONTINUE * end of the jbc-loop 2011 CONTINUE *2011 bailed out of the jbc-loop DO 2012 p = igl, MIN( igl+KBL-1, N ) SVA( p ) = ABS( SVA( p ) ) 2012 CONTINUE *** 2000 CONTINUE *2000 :: end of the ibr-loop * * .. update SVA(N) IF( ( SVA( N ).LT.ROOTBIG ) .AND. ( SVA( N ).GT.ROOTSFMIN ) ) $ THEN SVA( N ) = SCNRM2( M, A( 1, N ), 1 ) ELSE T = ZERO AAPP = ONE CALL CLASSQ( M, A( 1, N ), 1, T, AAPP ) SVA( N ) = T*SQRT( AAPP ) END IF * * Additional steering devices * IF( ( i.LT.SWBAND ) .AND. ( ( MXAAPQ.LE.ROOTTOL ) .OR. $ ( ISWROT.LE.N ) ) )SWBAND = i * IF( ( i.GT.SWBAND+1 ) .AND. ( MXAAPQ.LT.SQRT( REAL( N ) )* $ TOL ) .AND. ( REAL( N )*MXAAPQ*MXSINJ.LT.TOL ) ) THEN GO TO 1994 END IF * IF( NOTROT.GE.EMPTSW )GO TO 1994 * 1993 CONTINUE * end i=1:NSWEEP loop * * #:( Reaching this point means that the procedure has not converged. INFO = NSWEEP - 1 GO TO 1995 * 1994 CONTINUE * #:) Reaching this point means numerical convergence after the i-th * sweep. * INFO = 0 * #:) INFO = 0 confirms successful iterations. 1995 CONTINUE * * Sort the singular values and find how many are above * the underflow threshold. * N2 = 0 N4 = 0 DO 5991 p = 1, N - 1 q = ISAMAX( N-p+1, SVA( p ), 1 ) + p - 1 IF( p.NE.q ) THEN TEMP1 = SVA( p ) SVA( p ) = SVA( q ) SVA( q ) = TEMP1 CALL CSWAP( M, A( 1, p ), 1, A( 1, q ), 1 ) IF( RSVEC )CALL CSWAP( MVL, V( 1, p ), 1, V( 1, q ), 1 ) END IF IF( SVA( p ).NE.ZERO ) THEN N4 = N4 + 1 IF( SVA( p )*SKL.GT.SFMIN )N2 = N2 + 1 END IF 5991 CONTINUE IF( SVA( N ).NE.ZERO ) THEN N4 = N4 + 1 IF( SVA( N )*SKL.GT.SFMIN )N2 = N2 + 1 END IF * * Normalize the left singular vectors. * IF( LSVEC .OR. UCTOL ) THEN DO 1998 p = 1, N4 * CALL CSSCAL( M, ONE / SVA( p ), A( 1, p ), 1 ) CALL CLASCL( 'G',0,0, SVA(p), ONE, M, 1, A(1,p), M, $ IERR ) 1998 CONTINUE END IF * * Scale the product of Jacobi rotations. * IF( RSVEC ) THEN DO 2399 p = 1, N TEMP1 = ONE / SCNRM2( MVL, V( 1, p ), 1 ) CALL CSSCAL( MVL, TEMP1, V( 1, p ), 1 ) 2399 CONTINUE END IF * * Undo scaling, if necessary (and possible). IF( ( ( SKL.GT.ONE ) .AND. ( SVA( 1 ).LT.( BIG / SKL ) ) ) $ .OR. ( ( SKL.LT.ONE ) .AND. ( SVA( MAX( N2, 1 ) ) .GT. $ ( SFMIN / SKL ) ) ) ) THEN DO 2400 p = 1, N SVA( P ) = SKL*SVA( P ) 2400 CONTINUE SKL = ONE END IF * RWORK( 1 ) = SKL * The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE * then some of the singular values may overflow or underflow and * the spectrum is given in this factored representation. * RWORK( 2 ) = REAL( N4 ) * N4 is the number of computed nonzero singular values of A. * RWORK( 3 ) = REAL( N2 ) * N2 is the number of singular values of A greater than SFMIN. * If N2<N, SVA(N2:N) contains ZEROS and/or denormalized numbers * that may carry some information. * RWORK( 4 ) = REAL( i ) * i is the index of the last sweep before declaring convergence. * RWORK( 5 ) = MXAAPQ * MXAAPQ is the largest absolute value of scaled pivots in the * last sweep * RWORK( 6 ) = MXSINJ * MXSINJ is the largest absolute value of the sines of Jacobi angles * in the last sweep * RETURN * .. * .. END OF CGESVJ * .. END *