2021-12-16
§
|
11:08 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2003.codfw.wmnet with OS buster |
[production] |
10:59 |
<btullis> |
pushed new packages for druid version 0.19.0-2 on buster using reprepro |
[production] |
10:31 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reimage for host kafka-main2003.codfw.wmnet with OS buster |
[production] |
10:28 |
<elukey> |
second attempt to reimage kafka-main2003 to buster |
[production] |
10:09 |
<moritzm> |
drain primary/secondary instances off ganeti2007 T296622 |
[production] |
10:04 |
<moritzm> |
switched kubetcd2004 to DRBD-based storage to allow migration for reimages |
[production] |
09:50 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd2004.codfw.wmnet with reason: switch to drbd storage |
[production] |
09:50 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd2004.codfw.wmnet with reason: switch to drbd storage |
[production] |
09:46 |
<moritzm> |
added ganeti2028 to ganeti codfw cluster T294139 |
[production] |
09:38 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on ganeti2015.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage |
[production] |
09:38 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on ganeti2015.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage |
[production] |
09:11 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2024.codfw.wmnet with OS buster |
[production] |
08:55 |
<moritzm> |
drain primary/secondary instances off ganeti2015 T296622 |
[production] |
08:43 |
<moritzm> |
switch ml-etcd2003 to DRBD-based storage to allow migration for reimages |
[production] |
08:39 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
08:32 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
08:31 |
<urbanecm@deploy1002> |
Synchronized php-1.38.0-wmf.13/extensions/GrowthExperiments/: 35c055cead3d240625b76d21aa4e685525ca0d4b: MentorPageMentorManager: Do not fail hard with no mentor list configured (T297827) (duration: 01m 09s) |
[production] |
08:20 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reimage for host ganeti2024.codfw.wmnet with OS buster |
[production] |
08:07 |
<dcausse> |
restart blazegraph on wdqs1013 (JVM stuck for 4 hours) |
[production] |
03:27 |
<hoo> |
Stopped rebuildItemsPerSite on mwmaint1002 (was slightly beyond item Q72056756), as it has a memory leak (and would OOM in a few days) |
[production] |
01:53 |
<mutante> |
miscweb1002 / miscweb2002 - both backends 'PASS: 26 requests sent to miscweb1002.eqiad.wmnet. All assertions passed.' again after fixing httpbb tests and T297605 |
[production] |
01:50 |
<mutante> |
miscweb1002 - re-enabling puppet after deployment for T297605 |
[production] |
01:03 |
<legoktm> |
removing current dump from static-codereview to replace it with a new one |
[production] |
00:37 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
00:30 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
00:22 |
<legoktm> |
upgraded php7.2 on mw1414 for mysqlnd memory leak fix part 2 (T297667) |
[production] |
00:19 |
<legoktm> |
uploaded 7.2.34-18+0~20210223.60+debian10~1.gbpb21322+wmf5 to buster-wikimedia for T297667 |
[production] |
00:18 |
<ejegg> |
updated payments-wiki from df3ded67 to 55e605dd |
[production] |
00:16 |
<mutante> |
miscweb1002 - disable puppet, deploying gerrit:747600 on miscweb2002 first, indeed puppet problem detected T297605 |
[production] |
00:05 |
<legoktm> |
published new versions of php7.{2,4}-fpm-multiversion-base image with php-yaml extension (T296331) |
[production] |
2021-12-15
§
|
23:38 |
<milimetric@deploy1002> |
Finished deploy [analytics/refinery@0d74de0] (thin): Pushing 0.1.23 for SparkSQLNCLIDriver job (THIN) (duration: 00m 07s) |
[production] |
23:37 |
<milimetric@deploy1002> |
Started deploy [analytics/refinery@0d74de0] (thin): Pushing 0.1.23 for SparkSQLNCLIDriver job (THIN) |
[production] |
23:26 |
<milimetric@deploy1002> |
Finished deploy [analytics/refinery@0d74de0]: Pushing 0.1.23 for SparkSQLNCLIDriver job (duration: 15m 35s) |
[production] |
23:10 |
<milimetric@deploy1002> |
Started deploy [analytics/refinery@0d74de0]: Pushing 0.1.23 for SparkSQLNCLIDriver job |
[production] |
22:50 |
<legoktm> |
installing php-yaml on parsoid, jobrunners and maint servers |
[production] |
20:52 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
20:52 |
<hashar@deploy1002> |
Synchronized php-1.38.0-wmf.13/includes/skins/Skin.php: Remove migration script - T297484 (duration: 01m 06s) |
[production] |
20:51 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
20:46 |
<hashar@deploy1002> |
rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.38.0-wmf.13 refs T293954" |
[production] |
20:45 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
20:44 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
20:12 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
20:11 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
20:05 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
20:04 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
20:04 |
<hashar@deploy1002> |
Synchronized php: group1 wikis to 1.38.0-wmf.13 refs T293954 (duration: 01m 05s) |
[production] |
20:03 |
<hashar@deploy1002> |
rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.13 refs T293954 |
[production] |
19:52 |
<pt1979@cumin1001> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-test-coord1002.eqiad.wmnet with OS buster |
[production] |
19:45 |
<pt1979@cumin1001> |
START - Cookbook sre.hosts.reimage for host an-test-coord1002.eqiad.wmnet with OS buster |
[production] |