2014-09-15
|
22:53 |
<jeremyb> |
deployment-pdf01: pkill -f grain-ensure |
[releng] |
21:44 |
<andrewbogott> |
migrating deployment-videoscaler01 to virt1002 |
[releng] |
21:41 |
<andrewbogott> |
migrating deployment-sentry2 to virt1002 |
[releng] |
21:40 |
<cscott> |
*skipped* deploy of OCG, due to deployment-salt issues |
[releng] |
21:36 |
<bd808> |
Trying to fix salt with `salt '*' service.restart salt-minion` |
[releng] |
21:32 |
<bd808> |
only hosts responding to salt in beta are deployment-mathoid, deployment-pdf01 and deployment-stream |
[releng] |
21:29 |
<bd808> |
salt calls failing in beta with errors like "This master address: 'salt' was previously resolvable but now fails to resolve!" |
[releng] |
21:19 |
<bd808> |
Added Matanya to under_NDA sudoers group (bug 70864) |
[releng] |
20:18 |
<hashar> |
restarted salt-master |
[releng] |
19:50 |
<hashar> |
Killed a bunch of `python /usr/local/sbin/grain-ensure contains ...` and `/usr/bin/python /usr/bin/salt-call --out=json grains.append deployment_target scap` commands on deployment-bastion |
[releng] |
18:57 |
<hashar> |
scap breakage due to ferm is logged as https://bugzilla.wikimedia.org/show_bug.cgi?id=70858 |
[releng] |
18:48 |
<hashar> |
https://gerrit.wikimedia.org/r/#/c/160485/ tweaked a default ferm configuration file, which caused puppet to reload ferm. The reloaded rules block ssh from other hosts, thus breaking rsync \O/ |
[releng] |
18:37 |
<hashar> |
beta-scap-eqiad job has been broken since ~17:20 UTC https://integration.wikimedia.org/ci/job/beta-scap-eqiad/21680/console || rsync: failed to connect to deployment-bastion.eqiad.wmflabs (10.68.16.58): Connection timed out (110) |
[releng] |
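Note on the 18:37 and 18:48 entries: "Connection timed out (110)" is what a client typically sees when a firewall silently drops its packets, which fits the ferm reload described above. A minimal sketch of a reachability check, assuming Python 3 on the client host. The function name `tcp_port_open` is hypothetical, and the port to probe is an assumption: 22 if rsync runs over ssh, 873 if it talks to an rsync daemon.

```python
import socket


def tcp_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts (dropped packets) and DNS failures.
        return False
```

For example, `tcp_port_open("deployment-bastion.eqiad.wmflabs", 22)` run from a host that scap syncs to would return False while the blocking rules are in place. A False result does not by itself distinguish a firewall drop from a host that is down.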
2014-09-11
|
20:59 |
<spagewmf> |
https://integration.wikimedia.org/ci/ is down with 503 errors |
[releng] |
16:31 |
<YuviPanda> |
Deleted deployment-graphite instance |
[releng] |
16:13 |
<bd808> |
Now that scap is pointed to labmon1001.eqiad.wmnet the deployment-graphite.eqiad.wmflabs host can probably be deleted; it never really worked anyway |
[releng] |
16:12 |
<bd808> |
Updated scap to include I0f7f5cae72a87f68d861340d11632fb429c557b9 |
[releng] |
15:09 |
<bd808> |
Updated hhvm-luasandbox to latest version on mediawiki03 and verified that mediawiki0[12] were already updated |
[releng] |
15:01 |
<bd808> |
Fixed incorrect $::deployment_server_override var on deployment-videoscaler01; deployment-bastion.eqiad.wmflabs is correct and deployment-salt.eqiad.wmflabs is not |
[releng] |
10:05 |
<ori> |
deployment-prep: upgraded luasandbox and hhvm across the cluster |
[releng] |
08:41 |
<spagewmf> |
deployment-mediawiki01/02 are not getting the latest code |
[releng] |
05:10 |
<bd808> |
Reverted cherry-pick of I621d14e4b75a8415b16077fb27ca956c4de4c4c3 in scap; not the actual problem |
[releng] |
05:02 |
<bd808> |
Cherry-picked I621d14e4b75a8415b16077fb27ca956c4de4c4c3 to scap to try and fix l10n update issue |
[releng] |
02:29 |
<mutante> |
raised instance quota by 1 to 42 |
[releng] |
2014-09-10
|
19:38 |
<bd808> |
Fixed beta-recompile-math-texvc-eqiad job on deployment-bastion |
[releng] |
19:38 |
<bd808> |
Made /usr/local/apache/common-local a symlink to /srv/mediawiki on deployment-bastion |
[releng] |
19:37 |
<bd808> |
Deleted old /srv/common-local on deployment-videoscaler01 |
[releng] |
19:32 |
<bd808> |
Killed jobs-loop.sh tasks on deployment-jobrunner01 |
[releng] |
19:30 |
<bd808> |
Removed old mw-job-runner cron job on deployment-jobrunner01 |
[releng] |
19:19 |
<bd808> |
Deleted /var/log/account/pacct* and /var/log/atop.log.* on deployment-jobrunner01 to make some temporary room in /var |
[releng] |
19:14 |
<bd808> |
Deleted /var/log/mediawiki/jobrunner.log and restarted jobrunner on deployment-jobrunner01 |
[releng] |
19:11 |
<bd808> |
/var full on deployment-jobrunner01 |
[releng] |
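Note on the 19:11 to 19:19 entries: a full /var was only noticed after the fact and cleared by hand. A minimal sketch of a usage check that could flag this earlier, assuming Python 3 on the host. The function names and the 90% threshold are hypothetical and not taken from any existing monitoring on deployment-jobrunner01.

```python
import shutil


def usage_percent(path):
    """Return the percentage of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def is_nearly_full(path, threshold=90.0):
    """Return True when usage is at or above threshold percent (assumed default)."""
    return usage_percent(path) >= threshold
```

For example, `is_nearly_full("/var")` could be run from cron. The figure may differ slightly from the `Use%` column of `df`, which accounts for blocks reserved for root.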