2021-10-04
|
11:16 |
<lucaswerkmeister-wmde@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:720058|Add IA-Upload tool domains to Commons wgCopyUploadsDomains (T287241)]] (duration: 00m 59s) |
[production] |
11:12 |
<akosiaris@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mathoid' for release 'production' . |
[production] |
11:10 |
<akosiaris@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mathoid' for release 'production' . |
[production] |
11:07 |
<jiji@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
11:06 |
<akosiaris@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'mathoid' for release 'staging' . |
[production] |
11:04 |
<effie> |
depool wtp1026 for tests |
[production] |
11:04 |
<effie> |
pool wtp1025 |
[production] |
10:59 |
<jiji@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
09:13 |
<akosiaris> |
hbal -L -G row_C -X on ganeti01.svc.eqiad.wmnet |
[production] |
08:59 |
<jgiannelos@deploy1002> |
Finished deploy [kartotherian/deploy@071f7c3] (eqiad): Increase mirrored traffic to 100% for eqiad (duration: 00m 54s) |
[production] |
08:58 |
<jgiannelos@deploy1002> |
Started deploy [kartotherian/deploy@071f7c3] (eqiad): Increase mirrored traffic to 100% for eqiad |
[production] |
07:43 |
<joal> |
Kill-restart mediawiki-history-reduced job after deploy (more resources)
[analytics] |
07:37 |
<joal@deploy1002> |
Finished deploy [analytics/refinery@38f3adc] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@38f3adc] (duration: 06m 14s) |
[production] |
07:32 |
<joal> |
Deploy refinery to hdfs |
[analytics] |
07:31 |
<joal@deploy1002> |
Started deploy [analytics/refinery@38f3adc] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@38f3adc] |
[production] |
07:30 |
<joal@deploy1002> |
Finished deploy [analytics/refinery@38f3adc] (thin): Hotfix analytics deploy THIN [analytics/refinery@38f3adc] (duration: 00m 06s) |
[production] |
07:30 |
<joal@deploy1002> |
Started deploy [analytics/refinery@38f3adc] (thin): Hotfix analytics deploy THIN [analytics/refinery@38f3adc] |
[production] |
07:29 |
<joal@deploy1002> |
Finished deploy [analytics/refinery@38f3adc]: Hotfix analytics deploy [analytics/refinery@38f3adc] (duration: 19m 18s) |
[production] |
07:19 |
<dcausse> |
restarting blazegraph on wdqs2001 & wdqs2004 (allocators burning too quickly) |
[production] |
07:18 |
<elukey> |
depool + restart blazegraph + restart updater for wdqs1006 |
[production] |
07:18 |
<elukey@puppetmaster1001> |
conftool action : set/pooled=inactive; selector: name=wdqs1006.wmnet |
[production] |
07:18 |
<elukey@puppetmaster1001> |
conftool action : set/pooled=inactive; selector: name=wdqs1004.wmnet |
[production] |
07:10 |
<joal@deploy1002> |
Started deploy [analytics/refinery@38f3adc]: Hotfix analytics deploy [analytics/refinery@38f3adc] |
[production] |
07:10 |
<joal> |
Deploy refinery for mediawiki-history-reduced hotfix |
[analytics] |
07:02 |
<godog> |
swift eqiad-prod: add weight to ms-be10[64-67] - T290546 |
[production] |
06:56 |
<joal> |
Kill-restart pageview-monthly_dump-coord to apply fix for SLA |
[analytics] |
06:44 |
<elukey> |
depool + restart blazegraph + restart updater on wdqs1004 |
[production] |
05:50 |
<ladsgroup@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' . |
[production] |
05:49 |
<ladsgroup@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' . |
[production] |
05:47 |
<ladsgroup@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . |
[production] |
2021-10-03
|
21:30 |
<bstorm> |
rebuilding buster containers since they are also affected T291387 T292355 |
[tools] |
21:29 |
<bstorm> |
rebuilt stretch containers for potential issues with LE cert updates T291387 |
[tools] |
18:35 |
<chicocvenancio> |
building python 2.7.18 to use in python 3.9 container T292355 |
[tools.video2commons] |
17:28 |
<Operator873|CVN> |
restarted bots 5, 12, 28, and 29 failed to regain nick. |
[cvn] |
14:45 |
<_joe_> |
restarting acmechief on acmechief1001 |
[production] |
12:55 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Depool db1127, bad ram', diff saved to https://phabricator.wikimedia.org/P17414 and previous config saved to /var/cache/conftool/dbconfig/20211003-125530-kormat.json |
[production] |
12:02 |
<majavah> |
update to python 3.9 after it broke due to recent LE changes; was using python 3.4 / jessie
[tools.sge-jobs] |
08:24 |
<elukey> |
powercycle cp5006 (unresponsive to ssh, remote tty available but not able to login as root, no prometheus metrics in hours) |
[production] |
08:23 |
<elukey@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet |
[production] |