2020-06-05
§
|
22:18 |
<MacFan4000> |
restarting for code and config changes |
[tools.zppixbot] |
21:58 |
<MacFan4000> |
restarting for code and config changes |
[tools.zppixbot-test] |
21:56 |
<Texas> |
chmod o-r default.cfg |
[tools.zppixbot] |
21:53 |
<Texas> |
chmod a-r default.cfg |
[tools.zppixbot] |
21:52 |
<Krinkle> |
Add Cam11598 to cloud admins for cvn.wmflabs.org |
[cvn] |
21:46 |
<Cam11598> |
soft reboot cvn-app9 |
[cvn] |
21:27 |
<Texas> |
kubectl delete pods --all |
[tools.zppixbot] |
20:08 |
<Cam11598> |
restarted all bots due to ping timeout |
[cvn] |
18:01 |
<Urbanecm> |
jstart -N stewardbot -mem 2G /data/project/stewardbots/venv-py3/bin/python3 -u /data/project/stewardbots/stewardbots/StewardBot/StewardBot.py |
[tools.stewardbots] |
17:59 |
<Urbanecm> |
Kill stewardbot job, so I can debug from commandline |
[tools.stewardbots] |
17:56 |
<elukey> |
roll restart presto server on an-presto* to pick up new openjdk upgrades |
[analytics] |
17:54 |
<Urbanecm> |
Restart stewardbot IRC bot |
[tools.stewardbots] |
16:45 |
<elukey@deploy1001> |
Finished deploy [analytics/turnilo/deploy@f7e4f78]: Upgrade to 1.24.0 (duration: 00m 11s) |
[production] |
16:45 |
<elukey> |
upgrade turnilo to 1.24.0 |
[analytics] |
16:45 |
<elukey@deploy1001> |
Started deploy [analytics/turnilo/deploy@f7e4f78]: Upgrade to 1.24.0 |
[production] |
16:30 |
<bd808> |
Hard restart for LDAP session termination issues |
[tools.stashbot] |
16:29 |
<bd808> |
Testing stashbot following hard restart of service. It was having LDAP connection failure problems. |
[production] |
16:28 |
<Operator873> |
11:27 CVNBot15 restart |
[cvn] |
16:00 |
<AndyRussG> |
Turned off Fundraising job recurring_smashpig_charge |
[production] |
15:54 |
<cdanis> |
enabling & rerunning puppet on netflow* T254574 |
[production] |
15:43 |
<hashar> |
Building image docker-registry.discovery.wmnet/releng/operations-puppet:0.7.2 https://gerrit.wikimedia.org/r/#/c/integration/config/+/602710/ |
[releng] |
15:39 |
<cdanis> |
disabling puppet on netflow* and trying I6598d8f8 on netflow3001 first T254574 |
[production] |
15:39 |
<cdanis> |
disabling puppet on netflow* and trying I6598d8f8 on netflow3001 first |
[production] |
15:08 |
<andrewbogott> |
trying to re-enable puppet without losing cumin contact, as per https://phabricator.wikimedia.org/T254589 |
[admin] |
13:33 |
<jayme@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'mathoid' for release 'staging' . |
[production] |
13:26 |
<elukey> |
reimage druid1006 to debian buster |
[analytics] |
13:19 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'cxserver' for release 'staging' . |
[production] |
13:19 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
13:19 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'citoid' for release 'staging' . |
[production] |
13:18 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . |
[production] |
13:15 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
12:55 |
<ladsgroup@deploy1001> |
Synchronized wmf-config/interwiki.php: Hotfix for be-tarask interwiki link being broken (T111853) (duration: 01m 00s) |
[production] |
12:41 |
<mutante> |
rebooting gerrit1002 to add more vCPUs, after [ganeti1009:~] $ sudo gnt-instance modify -B vcpus=8 gerrit1002.wikimedia.org T239151 |
[production] |
12:20 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'zotero' for release 'staging' . |
[production] |
12:19 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' . |
[production] |
12:19 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'cxserver' for release 'staging' . |
[production] |
12:19 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'citoid' for release 'staging' . |
[production] |
12:19 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . |
[production] |
12:17 |
<akosiaris> |
update blubberoid changeprop changeprop-jobqueue citoid cxserver wikifeeds zotero in staging to latest charts |
[production] |
12:17 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' . |
[production] |
12:17 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . |
[production] |
12:17 |
<akosiaris> |
fix typo in ganeti2016 /etc/network/interfaces and reboot |
[production] |
11:28 |
<akosiaris> |
master-failover from ganeti2001 to ganeti2019 for ganeti01.svc.codfw.wmnet |
[production] |
11:25 |
<akosiaris@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
11:25 |
<akosiaris@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
11:25 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
11:14 |
<mutante> |
running puppet on all ganeti nodes |
[production] |
11:05 |
<ladsgroup@deploy1001> |
Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 14s) |
[production] |
10:32 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) |
[production] |
10:11 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) |
[production] |