2020-01-30
21:04 | <andrewbogott> | also apt-get install python3-novaclient on tools-prometheus-03 and tools-prometheus-04 to suppress cronspam. Possible real fix for this is https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/569084/ | [tools]
20:39 | <andrewbogott> | apt-get install python3-keystoneclient on tools-prometheus-03 and tools-prometheus-04 to suppress cronspam. Possible real fix for this is https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/569084/ | [tools]
19:37 | <mutante> | copying /var/log/apache2 to /root on all eqiad mw appservers to preserve logs | [production]
18:07 | <vgutierrez> | depool cp4032 and perform a rolling restart of varnish-fe at cp4027-cp4031 - T243634 | [production]
17:51 | <ladsgroup@deploy1001> | Synchronized php-1.35.0-wmf.16/extensions/Wikibase/lib/includes/Store/Sql/Terms/FingerprintableEntityTermStoreTrait.php: wbterms: Fix incorrect deletion of rows in findActuallyUnusedTermIds (T243944) (duration: 01m 06s) | [production]
17:49 | <ladsgroup@deploy1001> | Synchronized php-1.35.0-wmf.16/extensions/Wikibase/repo/maintenance/rebuildItemTerms.php: wbterms: Write only to the new term store in rebuildItemTerms (T243944) (duration: 01m 09s) | [production]
17:03 | <vgutierrez> | repooling cp4032 - T243634 | [production]
17:02 | <vgutierrez> | restarting varnish-frontend on cp4031 before it crashes - T243634 | [production]
16:27 | <arturo> | create VM tools-prometheus-04 as cold standby of tools-prometheus-03 (T238096) | [tools]
16:26 | <vgutierrez> | manually refreshing OCSP stapling response for non-canonical-redirects-3 - T243948 | [production]
16:25 | <arturo> | point tools-prometheus.wmflabs.org proxy to tools-prometheus-03 (T238096) | [tools]
13:42 | <arturo> | disable puppet in prometheus servers while syncing metric data (T238096) | [tools]
13:14 | <arturo> | drop floating IP 185.15.56.60 and FQDN `prometheus.tools.wmcloud.org` because this is not how the prometheus setup is right now. Use a web proxy instead: `tools-prometheus-new.wmflabs.org` (T238096) | [tools]
13:09 | <arturo> | created FQDN `prometheus.tools.wmcloud.org` pointing to IPv4 185.15.56.60 (tools-prometheus-03) to test T238096 | [tools]
12:59 | <arturo> | associated floating IPv4 185.15.56.60 to tools-prometheus-03 (T238096) | [tools]
12:57 | <arturo> | created domain `tools.wmcloud.org` in the tools project after some back and forth with Designate, permissions, and the database. I plan to use this domain to test the new Debian Buster-based prometheus setup (T238096) | [tools]
12:22 | <arturo> | add prometheus 2.7.1+ds-3+k8s+buster to buster-wikimedia T238096 (basically a rebuild from stretch) | [production]
10:20 | <arturo> | create new VM instance tools-prometheus-03 (T238096) | [tools]
06:23 | <vgutierrez> | restarting varnish-frontend on cp4030 before it crashes - T243634 | [production]
06:21 | <vgutierrez> | depool cp4032 - T243634 | [production]
05:12 | <vgutierrez> | restarting varnish-frontend and repooling cp4029 - T243634 | [production]
05:00 | <vgutierrez> | depooling cp4029 | [production]
2020-01-29
23:55 | <James_F> | zuul: [data-values/value-view] Migrate to node10, now that that passes T228453 | [releng]
23:37 | <marostegui> | Remove partitions from db2087:3317 - T239453 | [production]
20:56 | <Krinkle> | Reloading Zuul to deploy https://gerrit.wikimedia.org/r/568600 | [releng]
20:07 | <bd808> | Created {bastion,login,dev}.toolforge.org service names for Toolforge bastions using Horizon & Designate | [tools]
18:17 | <XioNoX> | move knams netflow sampling to cr3-knams | [production]
17:47 | <arturo> | delete VM arturo-ocata-test no longer in use | [openstack]
17:19 | <krinkle@deploy1001> | Synchronized wmf-config/etcd.php: Ice8dad2 (duration: 01m 10s) | [production]
14:20 | <arturo> | created and deleted a couple of test VMs for https://gerrit.wikimedia.org/r/c/operations/puppet/+/568473 | [openstack]
01:11 | <vgutierrez> | varnish-frontend restarted on cp4031 | [production]
01:09 | <vgutierrez> | repool cp4031 | [production]
01:05 | <marostegui> | Disable notifications for dbstore1005:3318 slave lag - T243871 | [production]
01:03 | <vgutierrez> | depool cp4031 | [production]
00:35 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Fully repool db1097:3314 T239453', diff saved to https://phabricator.wikimedia.org/P10289 and previous config saved to /var/cache/conftool/dbconfig/20200129-003507-marostegui.json | [production]
00:22 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Slowly repool db1097:3314 T239453', diff saved to https://phabricator.wikimedia.org/P10288 and previous config saved to /var/cache/conftool/dbconfig/20200129-002203-marostegui.json | [production]
00:05 | <James_F> | layout: [mediawiki/extension/NSFileRepo] Restore tests now they're fixed T196480 | [releng]
2020-01-28
23:53 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Slowly repool db1097:3314 T239453', diff saved to https://phabricator.wikimedia.org/P10287 and previous config saved to /var/cache/conftool/dbconfig/20200128-235336-marostegui.json | [production]
23:46 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Slowly repool db1097:3314 T239453', diff saved to https://phabricator.wikimedia.org/P10286 and previous config saved to /var/cache/conftool/dbconfig/20200128-234601-marostegui.json | [production]
23:42 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Start repooling db1084 with its original weight', diff saved to https://phabricator.wikimedia.org/P10285 and previous config saved to /var/cache/conftool/dbconfig/20200128-234219-marostegui.json | [production]
23:40 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Repool db1121 T232446', diff saved to https://phabricator.wikimedia.org/P10284 and previous config saved to /var/cache/conftool/dbconfig/20200128-234037-marostegui.json | [production]
17:24 | <arturo> | [codfw1dev] root@cloudcontrol2001-dev:~# designate server-create --name ns0.openstack.codfw1dev.wikimediacloud.org. (T243766) | [admin]
15:06 | <addshore> | Start addshore@mwmaint1002:~$ ./T219123.sh # Taking over from @ladsgroup for T219123 | [production]
13:48 | <arturo> | crontab jobs activated again | [tools.jarbot]
13:35 | <arturo> | `aborrero@tools-clushmaster-02:~$ clush -w @exec-stretch 'for i in $(ps aux | grep [t]ools.j | awk -F" " "{print \$2}") ; do echo "killing $i" ; sudo kill $i ; done || true'` (T243831) | [tools]
11:18 | <arturo> | disabled all cronjobs per request from WMF SRE team: https://en.wikipedia.org/w/index.php?title=User_talk%3AJarBot&type=revision&diff=937974097&oldid=719916908 | [tools.jarbot]
11:15 | <arturo> | stopped grid jobs per request from WMF SRE team: https://en.wikipedia.org/w/index.php?title=User_talk%3AJarBot&type=revision&diff=937974097&oldid=719916908 | [tools.jarbot]
10:18 | <arturo> | [codfw1dev] created DNS record `bastion-codfw1dev-01.codfw1dev.wmcloud.org A 185.15.57.2` (T242976, T229441) | [admin]
10:13 | <arturo> | [codfw1dev] the zone `codfw1dev.wmcloud.org` belongs now to the `cloudinfra-codfw1dev` project (T242976) | [admin]
10:11 | <arturo> | [codfw1dev] `root@cloudcontrol2001-dev:~# openstack zone create --description "main DNS domain for public addresses" --email "root@wmflabs.org" --type PRIMARY --ttl 3600 codfw1dev.wmcloud.org.` (T242976 and T243766) | [admin]